Contents


Intensity of writing: approx. 241,585 words in the entire blog, 11,110 unique ones, written in 26 geographic places.

Here is my agenda for the near future. This is a full list of the 289 blog posts published:
Code Must Be Clean. And Clear.

  • Odessa, Ukraine

There is a famous book by Robert Martin called Clean Code. The title is an obvious call to all of us: the code must be clean. Clean, like a kitchen, I suppose---there are no dirty dishes, no garbage on the floor, no smelly towels. Dirt to be cleaned in a code base, according to Martin, includes large methods, non-descriptive variable names, tight coupling, lack of SOLID and SRP compliance, and many other things. Read the book, it's worth it. However, there is yet another aspect of source code. How clear is it?

The Rum Diary (2011) by Bruce Robinson

The kitchen is clean when there is no dirt in the oven. But if its electric panel speaks French, I can't use the kitchen. Even if it's perfectly clean. It's not clear how to use it---that's why it's useless.

The metaphor applies to the source code. Making it clean is the first and very important step, which will remove all those coding anti-patterns so many books speak about, including my favorite Code Complete by Steve McConnell, Working Effectively With Legacy Code by Michael Feathers, and Clean Code. A very important step, but not the most important one. A dirty kitchen that is useful is better than a clean one that I can't use, isn't it?

Making code clean but leaving it difficult for others to understand is the pitfall most of us fall into. By others I mean everybody, from our fellow in-project co-developers sitting next to us at the same desk, to imaginary junior contributors who will join the project in five years, after we're all hired by Google. All of them, across this very large time frame, must be able to use the kitchen source code without any additional help. The oven has to speak their language, not the language of its designer.

How do you do that? How do you make sure the code is clear, not just clean?

Well, test it. Ask someone who is outside of the project to take a look at your code and tell you how clear it is. Not how beautiful your classes and code constructs are---that's what makes it clean. Instead, ask someone to fix a bug in just 30 minutes and see how they react. You will realize how clear the code is and whether it speaks the language a stranger can understand.

This is the definition of maintainability. If a stranger can modify your code and fix a bug in less than an hour, it is maintainable. Obviously, cleanliness will help. But it's not enough. There has to be something else, which I don't really know how to describe. The only way to achieve it is to let strangers regularly see your code, attempt to make a contribution, and report bugs when something is not clear.

Making your code open, and encouraging programmers to report bugs when something is not only broken but unclear, are the two best ways to achieve high maintainability.

© Yegor Bugayenko 2014–2018

Monolithic Repos Are Evil

  • Moscow, Russia

We all keep our code in Git version control repositories. The question is whether we should create a new repository for each new module or try to keep as much as possible in a single, so-called "monolithic" repo. Market leaders, like Facebook and Google, advocate the second approach. I believe they are wrong.

Funny Games (2007) by Michael Haneke

Let's use the following JavaScript function as an example. It downloads a JSON document from a Zold node (using jQuery) and places part of its content on the HTML page. Then it colors the data according to its value.

// main.js
function main() {
  $.getJSON('http://b1.zold.io/', function(json) {
    var $body = $('body');
    $body.text(json.nscore);
    var color = 'red';
    if (json.nscore > 500) {
      color = 'green';
    }
    $body.css('color', color);
  });
}

Pretty obvious, isn't it? Just a single main.js file which does everything we need. We simply add it to the HTML and it works:

<html>
  <head>
    <script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
    <script src="main.js"></script>
  </head>
  <body onload="main();">loading...</body>
</html>

Now, let me refactor it. Let me break it into two pieces. The first piece will load the data and the second one will be a jQuery plugin to colorize HTML content according to the data it contains. This is how the plugin will look:

// colorize.js
$.fn.colorize = function(colors) {
  var data = parseFloat(this.text());
  var keys = Object.keys(colors)
    .map(function (k) { return parseInt(k, 10); })
    .sort(function (a, b) { return a - b; })
    .reverse();
  for (var i = 0; i < keys.length; ++i) {
    var max = keys[i];
    if (data >= max) {
      this.addClass(colors[max]);
      return this;
    }
    this.removeClass(colors[max]);
  }
  return this;
};

The main.js will look like this:

// main.js
function main() {
  $.getJSON('http://b1.zold.io/', function(json) {
    $('body')
      .text(json.nscore)
      .colorize({ 500: 'green', 0: 'red' });
  });
}

Now, instead of a single monolithic piece of code, we have two smaller pieces which have to be loaded together into the target HTML:

<html>
  <head>
    <script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
    <script src="colorize.js"></script>
    <script src="main.js"></script>
  </head>
  <body onload="main();">loading...</body>
</html>

Two pieces are better than one? It seems that Google, Digital Ocean and Mozilla don't think so.

I disagree.

To illustrate my point I extracted the JavaScript function into a new standalone jQuery plugin. Here is what I did:

  • Created a new GitHub repo yegor256/colorize;
  • Read the instructions;
  • Did some research of jQuery plugins, studied a few examples;
  • Found out that most of them used Gulp, which I've never heard of;
  • Decided to use npm for JavaScript packaging (what else, right?);
  • Created package.json for npm;
  • Renamed GitHub repo to colorizejs when I found out that npm package colorize already exists;
  • Configured .travis.yml for Travis;
  • Created a README.md and explained how to use it and install it;
  • Decided to use the MIT license and created LICENSE.txt;
  • Configured PDD for puzzles automated collection;
  • Configured .rultor.yml for Rultor;
  • Tried to create a unit test and failed miserably (after a full hour of research), since I had almost no experience in JS unit testing;
  • Posted a question to StackOverflow;
  • The question was answered by a few people only after the bounty I offered;
  • @brian-lives-outdoors's answer was the best and he even submitted a pull request with a unit test, which I merged;
  • Released the first version 0.0.1 to npmjs.com;
  • Modified the code to make it work both with classes and colors;
  • Implemented and released the next version 0.1.0;
  • Added it to Zold front-end, tested it, and released it---check it out here.

It took almost three weeks of waiting and four hours of work, just to move a small piece of JavaScript code to a new repository and release it separately. Was it worth it? Well, I think it was. But most other blog post authors I managed to find think that it would be better to keep everything in a single monolithic repo, mostly because it's better for productivity. For example, Advantages of monorepos by Dan Luu, Advantages and Disadvantages of a Monolithic Repository (a case study at Google) by Ciera Jaspan et al., and How Monolithic Repository in Open Source saved my Laziness by Tomas Votruba.

There are also a few good analyses of both approaches, for example Monolithic repositories vs. Many repositories speech by Fabien Potencier at dotScale 2016 and Repo Style Wars: Mono vs Multi by Peter Seibel.

In a nutshell, they all claim that productivity is higher with a monolithic repo because the amount of operations one has to do in order to make a change is smaller. Indeed, in a monorepo there will be a single branch, a single set of commits, a single pull request, a single merge, deploy and release. Also it will be easier to test, both manually and via unit testing. Continuous integration is easier to configure, and so on and so forth.

All these "reasonable" arguments remind me of what I hear when preaching object decomposition and suggesting that multiple objects are better than a single large one. Imagine a large class of 3,000 lines of code, which does many things and they are all very tightly coupled. It's "easy" to test it, to make changes, to deploy, to review, etc. Because everything stays in one file, right? We don't need to jump from class to class in order to understand the design. We just look at one screen, scroll it up and down, and that's it. Right? Totally wrong!

I guess I don't need to explain why it's wrong. We don't design our software that way anymore. We know that tight coupling is a bad idea. We know that a set of smaller components is better than a larger solid piece.
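The same argument can be shown in code. Here is a hypothetical Java miniature (my own sketch, not from the original post; all class names are invented), contrasting one class that does everything with a decomposed pair, mirroring the colorize example above:

```java
// A "monolithic" class that mixes two concerns: picking the
// color for a score and rendering the HTML around it.
class MonolithicBadge {
  String render(int score) {
    String color = score > 500 ? "green" : "red";
    return "<span class=\"" + color + "\">" + score + "</span>";
  }
}

// The decomposed version: each class owns one decision and can be
// tested, reused, and replaced independently of the other.
class Color {
  private final int score;
  Color(int score) { this.score = score; }
  String name() { return this.score > 500 ? "green" : "red"; }
}

class Badge {
  private final int score;
  Badge(int score) { this.score = score; }
  String html() {
    return "<span class=\"" + new Color(this.score).name() + "\">"
      + this.score + "</span>";
  }
}
```

Both produce identical HTML, but only the second lets you swap the color policy, or test it alone, without touching the rendering.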

Why can't we apply the same logic to repositories? I believe we can. Of course, just like in object-oriented programming, a fine-grained design requires more skills and time. Look at what I had to do with this small jQuery plugin. I've spent hours of coding and thinking. I even had to learn Gulp and Jasmine, which I most probably will not use anymore. But the benefits we are getting from it are enormous. This is my short list of them:

  • Encapsulation. Each repo encapsulates a single problem, hiding its details from everybody else. Thanks to that, the scope each repo has to deal with gets smaller. The smaller the scope, just like in OOP, the easier it is to maintain and modify. The easier to maintain, the cheaper the development. I guess Google guys don't really worry about costs. On the contrary, they want their salaries to grow. A large unmaintainable monolithic repo is a perfect tool to make it happen.

  • Fast Builds. When a repo is small, the time its automated build takes is small. Look at the time Travis spends for my jQuery plugin. It's 51 seconds. It's fast. We all know that the faster the build, the better it is for productivity, since it's easier to use the build as a tool for development.

  • Accurate Metrics. I don't know whether you rely on metrics in your projects, but we at Zerocracy do pay attention to numbers, like lines of code, hits of code, number of commits, classes, methods, cohesion, coupling, etc. It's always a question whether the metrics are accurate. Calculating lines of code for a large repository doesn't make any sense, since the number will include a lot of files from completely different parts of the application. Moreover there will be different languages and file formats. Say a repo has 200K lines of Java, 150K lines of XML, 50K lines of JavaScript, and 40K lines of Ruby. Can you say something specific about this repo? Is it large? Is it a Java repo? And, more importantly, can it be compared with other repositories? Not really. It's just a big messy storage of files.

  • Homogeneous Tasks. Smaller repositories tend to have smaller tech stacks, meaning that each of them uses just a few languages and frameworks, or (and this is the preferred situation)---one language or technology per repository. Thanks to this, the management of programmers becomes easier, since any ticket/problem can be assigned to anybody. It's easier to make tasks similar in size and complexity. This obviously means better manageability of the project.

  • Single Coding Standard. It's easier to standardize the coding style if the repo is small. When it's large, various parts of the code base will have different styles and it will be almost impossible to put everybody on the same page. In other words, smaller repositories look more beautiful than larger ones.

  • Short Names. Each repository, inevitably, will have its own namespace. For example, in the JS repository I just created, I only have two files: colorizejs.js and test-colorizejs.js. I don't really care about the naming inside them, since the namespace is very small. I can even use global variables. Shorter names and smaller namespaces mean better maintainability.

  • Simple Tests. The larger the code base, the more dependencies it has, which are difficult to mock and test. Very large code bases become fundamentally untestable since they require a lot of integration tests which are difficult to maintain. Smaller libraries, frameworks and modules are easier to keep at the level of simple and fast unit testing.

Thus, I believe that the smaller the repositories and modules, the better. Ideally, I would say, the largest acceptable size for a code base is 50,000 lines of code. Everything that goes above this line is a perfect candidate for decomposition.


Soft Skills Demystified

  • Moscow, Russia

There are tech skills and there are soft skills. Every programmer knows that. Tech skills are about algorithms, operators, classes, objects, and everything else they teach us in tech schools. Soft skills are about something else. What exactly? Difficult to say. Let's try to clear the air.

Glengarry Glen Ross (1992) by James Foley

Here is a non-exhaustive list of soft skills I managed to find on the Net: empathy, open-mindedness, a willingness to learn, effective communication, teamwork, approachability, helpfulness, listening, patience, responsibility, critical thinking, problem solving, mentoring, attunement, clarity, curiosity, strategizing, lifelong learning, business mindedness, work ethic, judgment, ego management, commitment, accountability, creativity, adaptability, big-picture thinking. Phew!

Do you really understand exactly what these words mean? I don't.

I would actually suggest we replace them all with the "do the right thing" mantra and call it a day.

I have my own list of soft skills though. I also strongly believe that tech skills are just a small part of what makes a good programmer, but being empathetic and ready to learn is not what the other part consists of. Of course, smiling in the office and not cursing at a stupid boss helps. But this is not what soft skills are about.

They are about our ability to exchange deliverables. Tech skills produce deliverables, soft skills turn them into a final product, which is working software. Here is a short list, in no particular order:

Drawing. Talking works great when you're discussing your next vacation with your partner. In a software team your ability to explain your thoughts with a diagram seriously increases your usefulness.

Writing. Again, talking is great when your chaotic team is being managed by a hysterical boss who just read a book about unconditional love. In a more disciplined environment, your ability to put your thoughts in writing does make a difference for mutual success.

Reporting. A good programmer knows not only how to fix a bug but, more importantly, how to report it the right way, so that the project benefits. An ability to describe a technical problem in simple words is a crucial soft skill.

Volunteering. Open source is an important part of any software project. You have to know how to work with an open source community, by giving them something back for the software they provide. Sometimes you will have to report problems to them, sometimes even submit pull requests, and maybe even create your own open source products. You will need a lot of non-tech skills to do that.

Charging. Programmers make money by writing code. Very often projects fail because important people quit due to a money conflict. They don't know how to resolve that, how to manage their financial objectives, how to ask for a raise, or how to change the paying schedule. I blame programmers for that. We, technical people, have to know how to manage our financial relationship with our projects.

Relaxing. Many projects fail because their programmers burn out. This happens, very often, because they don't know how to manage their time right: when to work and when to relax. Again, I blame programmers. We have to know how to manage our own peace of mind.

Asking. Not your friends, but StackOverflow and other public sources. The software development world is getting global, and the knowledge your project team possesses is just a tiny fraction of what the world knows about the problem you are solving. You have to know how to ask the world. This is the soft skill you need to have to be a good developer.

Tweeting. Here comes your ability to share your thoughts and achievements in social networks. If you stay mute and net-social-phobic, you are not really helping your project. This is the skill you won't learn in a few days. I would suggest you take a look at my 256 Bloghacks book.

Testing. Here I mean not only writing automated tests, which is a tech skill, but an ability to communicate with testers, to make sure their feedback improves the quality of the software under development. There is a well known developer-tester conflict, which good programmers know how to deal with.

Branching. Still working in a single master branch? Still an amateur. You have to learn how to use multiple branches, how to resolve conflicts between them, and what is the difference between merge and rebase. This is a soft skill, since it doesn't have anything to do with the quality of your code but it seriously affects your professionalism as a software developer.

Failing. Most projects fail, one way or another. Technical failures are not the primary source of our troubles. We fail due to management incompetence most frequently. Good programmers know how to deal with failures, by provoking (aka Fail Fast), predicting (aka Risk Management), and embracing them.

Delivering. Continuous integration, delivery pipeline, build automation, staging, green/blue deployments, etc.---if you think that all these things concern the DevOps department only, you are wrong. You have to understand how your lines of code reach your users. The bigger the product, the longer the pipeline, the more people it involves, and the more soft skills it requires to be smooth.

Intriguing. Any project is a part of a bigger political game, one way or the other. If you isolate yourself from intrigue, claiming that your job is to write code---you are not a good programmer. A good programmer understands where the money is coming from, who the primary shareholders are, and which ass to kiss and when.

Did I forget anything important?


Builders and Manipulators

  • Palo Alto, CA

Here is a simple principle for naming methods in OOP, which I'm trying to follow in my code: it's a verb if it manipulates, it's a noun if it builds. That's it. Nothing in between. Methods like saveFile() or getTitle() don't fit and must be renamed and refactored. Moreover, methods that "manipulate" must always return void, for example print() or save(). Let me explain.

The Night Of (2016) by Richard Price et al.

First, I have to say that this idea is very similar to the one suggested by Bertrand Meyer in his book Object-Oriented Software Construction, where he proposes we divide an object's methods into two sharply separated categories: queries and commands.

The idea behind this principle is rather philosophical. Let's start with builders, which are supposed to create or find an object and then return it. Suppose I have a store of books and I ask it to give me a book by name:

interface Bookshelf {
  Book find(String title);
}

It's obviously a "builder" (or a "query" in Meyer's terms). I ask for a book and it's given to me. The problem, though, is with the name of the method. It's called "find," which implies that I know how the book will be dealt with. It will be found.

However, this is not how we should treat our objects. We must not tell them how to do the job we want them to do. Instead, we must let them decide whether the book will be found, constructed, or maybe taken from a memory cache. When we query, we have to say what result we are looking for and let the object make the decision about the way this result is going to be built. A much more appropriate name for this method would be book():

interface Bookshelf {
  Book book(String title);
}

The rule of thumb is: a builder is always a noun. If the method returns something, it has to be a noun. Preferably its name should explain what the method returns. If it's a book, name it book(). If it's a file, call the method file(), etc. Here are a few good builder examples:

interface Foo {
  float speed(Actor actor);
  Money salary(User user);
  File database();
  Date deadline(Project project, User user);
}

Here, on the contrary, are a few examples of badly named builders:

interface Foo {
  float calculateSpeed(Actor actor);
  Money getSalary(User user);
  File openDatabase();
  Date readDeadline(Project project, User user);
}

There is no place for a verb in a builder's name!

It's not only about the name, by the way. A builder, since its name doesn't contain a verb, should not do any modifications to the encapsulated entities. It may only create or find something and return it. Just like a pure function, it must not have any side-effects.

Next, there are "manipulators" (or "commands" in Meyer's terms). They do some work for us, modifying the entities which the object encapsulates. They are the opposite of builders, because they actually make changes to the world abstracted by the object. For example, we ask the Bookshelf to add a new book to itself:

interface Bookshelf {
  void add(Book book);
}

The method adds the book to the storage. How exactly the storage will be modified, we don't know. But we know that since the name of the method is a verb, there will be modifications.

Also, manipulators must not return anything. It's always void that we see as the type of their response. This is needed mostly in order to separate the imperative part of the code from the declarative part. We either receive objects or tell them what to do. We must not mix those activities in one method.
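To make the split concrete, here is one possible in-memory implementation of the Bookshelf from above. This is my own hypothetical sketch (the article only shows the interfaces; the Book class and the list-backed storage are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// A minimal Book: an immutable title holder.
class Book {
  private final String title;
  Book(String title) { this.title = title; }
  String title() { return this.title; }
}

// A Bookshelf that follows both rules: book() is a builder
// (a noun, returns an object, no side effects), while add()
// is a manipulator (a verb, modifies state, returns void).
class Bookshelf {
  private final List<Book> books = new ArrayList<>();

  // Builder: whether the book is found, cached, or created is the
  // shelf's own business; the caller only names the result.
  Book book(String title) {
    for (Book b : this.books) {
      if (b.title().equals(title)) {
        return b;
      }
    }
    throw new IllegalArgumentException("no such book: " + title);
  }

  // Manipulator: changes the encapsulated world, returns nothing.
  void add(Book book) {
    this.books.add(book);
  }
}
```

Usage stays strictly two-sided: `shelf.add(new Book("Object Thinking"))` tells, `shelf.book("Object Thinking")` asks, and neither method does the other's job.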

The purpose of these rules is to make the code simpler. If you follow them, and all your builders only return objects and your manipulators only modify the world, the entire design will become easier to understand. Methods will be smaller and their names shorter.

Of course, very often you will have a hard time finding those names. From time to time you will want to return something from a manipulator or make your builder make some changes, say to the cache. Try to resist this temptation and stay with the principle: a method is either a builder or a manipulator, nothing in the middle. The examples above are rather primitive, the code in real life is much more complicated. But that's what the principle is going to help us with---making the code simpler.

I'm also aware of the noun/verb principle, which suggests always naming classes as nouns and their methods as verbs. I believe it's a wrong idea, since it doesn't differentiate builders from manipulators and encourages us to always think in terms of imperative instructions. I believe that OOP must be much more about declarative composition of objects, even if we have to sometimes get them from other objects instead of instantiating them via constructors. That's why we do need builders in most situations and we also have to see an obvious difference between them and the other methods, manipulators.

You can find a more detailed discussion of this problem in Elegant Objects, Volume 1, Section 2.4.


70/70

  • Moscow, Russia

A few days ago, a friend of mine asked me to join him in a new startup. He said that he needed a partner to help him partially finance the project, promote it, bring in new ideas, and push the business forward. I liked the business idea and wanted to participate. I started to ask questions about our future partnership and attempted to draft a simple partnership agreement. He quickly got offended. He said that he was looking for a real partner, who would share his goals and would never require any papers or lawyers.

Lola rennt (1998) by Tom Tykwer

Here is a list of the questions I asked him:

  • What happens with my shares if I stop working?
  • Can I start a similar business with another partner later?
  • Who will have a checkbook and bank account access?
  • Who has final say if we disagree?
  • Do I need your permission to sell my shares to a friend?
  • Will there be any vesting schedule?
  • Who approves expenses?
  • Who controls the domain name registrar account?
  • Who pays if the project needs more money?

For the majority of people I attempt to do business with, these questions are indeed offensive. They feel like I'm trying to draft a prenup between us, while they would prefer our "marriage" to be based solely on love. They would have us promise never to betray each other's interests; but then the business suffers and eventually collapses.

Why so? Because it's impossible to be equal in business (just like it's impossible to be equal in marriage, but that's a topic for another blog post). Eventually a conflict of interests shows up and the inability to resolve it affects and damages the business. "Co-founder disputes have historically been one of the top reasons startups fail at the earliest possible stage," says Garry Tan.

I would call that type of partnership, which many young startup founders enter into, "70/70." For them the initial agreement is not a formula of future growth, but a symbol of respect. Each of them wants to be respected, and that's why they demand 70% of the business for themselves. They don't care and don't understand how exactly the distribution of assets and profit will work in the future. They don't foresee forthcoming conflicts and fights, mostly because they are too young or naive (or both). According to The Founder's Dilemmas by Noam Wasserman, "Founders' attachment, overconfidence, and naïveté may be necessary to get new ventures up and running, but these emotions later create problems."

They just want to have 70%... each!!!

It's not even 50/50 in most cases, but 70/70. This is what makes it funny. Very often each partner thinks that he or she is the major contributor to the project and that's why it's 70%.

And they have it.

When I was younger, I had no problem with entering into such an agreement. I was saying to myself: he wants 70% but he doesn't understand that in reality he won't even get a single percent, since, for example, we didn't put the ownership of the domain name into the contract. It is obvious right from the start that the partner is ready to trust me blindly, without even thinking about possible consequences. In the end, all of us, and our startups, were losing.

Don't get me wrong, trust between partners is a crucial and mandatory element of any deal. But you can't trust me to stand by my words, if there were no words. If we never discussed, for example, how much I get when I quit because I'm fed up with you, I will try to get as much as possible and where exactly that line of 70% goes will be a subject for an ugly debate. And this will happen in very different circumstances, where there will be, most probably, no love between us or maybe even friendship. In that future fight the one who is the most aggressive and cunning will win. Do you really want that to be me? Or do you expect it to be you?

Now I'm a bit older and more experienced. I've seen those ugly debates and don't want them anymore. I don't want the investment of my time and my money to go down the drain. I want to know upfront what exactly will happen with them in a month, a year, and a few decades, when I retire and my grandchildren inherit them.

That's why I ask all those "offensive" questions upfront. If the other side does indeed get offended, I attempt to explain to them what I just said in this article. If they don't understand and still want to marry me out of love, I walk away. As Paul Graham, a co-founder of Y Combinator, said, "Most of the disputes I've seen between founders could have been avoided if they'd been more careful about who they started a company with." I don't want to start a business with those who are scared of a prenup. I know that in that case the divorce will be ugly.


Either Bugs or Pull Requests ... or You Are Out

  • Moscow, Russia

Here is how it goes, over and over again. I find a new developer for one of my projects managed by Zerocracy. He claims to be an expert with 10 years of hands-on coding experience, a $60 hourly rate (we don't hire US guys), and fluent English. Then he joins the project and attempts to close a few tickets. But he can hardly close any, for many reasons. Then he comes back and explains why our microtasking methodology doesn't work, trying to convince me that I have to pay him per hour, instead of per result. Here is my answer.

Jamón, Jamón (1992) by Bigas Luna

No matter how bad the methodology is, you do know that we pay for each bug that is found and properly reported, right? Check §29 of our Policy.

If the Code Base Is Bad, Why Don't You Report Bugs?

If the Code Base Is Good, Where Are Your Pull Requests?

There is only one metric on our projects, which separates good programmers from bad ones: the amount of money they are making. You can make money contributing to the project either by 1) reporting bugs (when you see problems) or 2) submitting pull requests (when you don't see problems).

If none of that works for you, you are a bad programmer.

Goodbye.


What's Wrong With Global Variables?


  • Moscow, Russia

Only lazy people haven't written already about how global variables are evil. It started in 1973, when W. Wulf et al. claimed that "the non-local variable is a major contributing factor in programs which are difficult to understand." Since then, many other reasons were suggested to convince programmers to stop using global variables. I think I read them all, but didn't find the one that bothers me most of all: composability. In a nutshell, global variables make code difficult or impossible to compose in ways different from what its original author expected.

El Chapo, Season 1 (2017) by Silvana Aguirre et al.

I was recently writing a web front for Zold in Ruby, on top of Sinatra. This is how a web server starts according to their documentation:

App.start!

Here start! is a static method of the App class, which you have to declare as a child of their default parent Sinatra::Base. To tell the app which TCP port to listen to you have to preconfigure it:

require 'sinatra/base'
class App < Sinatra::Base
  get '/' do
    'Hello, world!'
  end
end
App.set(:port, 8080)
App.start!

What do you do if you want to start two web servers? For the purpose of testing this may be a pretty logical requirement. For example, since Zold is a distributed network, it is necessary to test how a number of servers communicate with each other. I can't do that! There is absolutely no way, because Sinatra is designed with the assumption that only one server may exist in the entire application scope.

Can this really be fixed? Let's take a look at their code. Class Sinatra::Base is essentially a Singleton, which is not supposed to have more than one instance. When we call App.set(:port, 8080), the value 8080 is saved into an attribute of a single instance. The number 8080 becomes available for all methods of Sinatra::Base, no matter what instance they are called from.

They are not using true Ruby global variables, I believe, because they know that they are bad. But why exactly they are bad, and what the alternatives are, seems to have slipped through their fingers.

Technically speaking, their design is "globally scoped." Sinatra::Base treats the entire application as its scope of visibility. No matter who calls it, everything is visible, including what was created in previous calls and in previously instantiated objects. This "class" is a giant bag of global variables.
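To make the problem tangible outside of Sinatra, here is a toy sketch that contrasts class-scoped state with instance-scoped state. The class names are my own inventions for illustration; this mimics the shape of Sinatra's design but is not Sinatra's actual code:

```ruby
# A "globally scoped" app: settings live in a class variable,
# so every caller shares the same single copy of them.
class GlobalApp
  @@settings = {}
  def self.set(key, value)
    @@settings[key] = value
  end
  def self.setting(key)
    @@settings[key]
  end
end

GlobalApp.set(:port, 8080)
GlobalApp.set(:port, 9090) # the "second server" silently kills the first
GlobalApp.setting(:port)   # => 9090; the 8080 configuration is gone

# An instance-scoped app composes without conflicts:
# each object carries its own port.
class ComposableApp
  attr_reader :port
  def initialize(port)
    @port = port
  end
end

first = ComposableApp.new(8080)
second = ComposableApp.new(9090)
first.port  # => 8080
second.port # => 9090
```

With the second design, a test can instantiate as many "servers" as it needs, each with its own configuration, which is exactly what the Zold tests required.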

Every global variable is a troublemaker of that kind. While the application is small and its test coverage is low, global variables may not hurt. But the bigger the app and the more sophisticated its automated testing scenarios, the more difficult it will be to compose objects which depend on global variables, singletons, or class variables.

My recommendation? Under no circumstances even think about any global variables.


Are You an Architect?


  • Moscow, Russia

Over twenty-five years ago, in 1992, at an OOPSLA workshop in Vancouver, Kent Beck, in answer to the question "What is an architect?" said, according to Philippe Kruchten, that it is "a new pompous title that programmers demand to have on their business cards to justify their sumptuous emoluments." Not much has changed since then. There is a big difference between a smart programmer and a project architect. Here is a list of traits that, I believe, a good architect has.

No Country for Old Men (2007) by Coen Brothers

Disclaimer: Even though I haven't seen a single female software architect in my life, I have to say for my leftist/feminist readers that in this blog post I'm assuming an architect is a man only for the sake of convenience of speech. There is no intention to offend anyone.

He Is Loyal.

Programmers come and go. They are, as I mentioned many times before, egoists with a strong focus on their personal profit. They change projects, they work on multiple projects at the same time, they have no personal attachments to any piece of code. They worry only about their individual tasks and feature branches. The branch is merged? All bets are off. Professional developers are "polygamous" and disloyal.

An architect, however, is a different creature. He stays with the project even after it runs out of funds, loses the last programmer, and proves that the architecture is a total mess that can't handle even a fraction of the traffic it was supposed to work under. The architect stays and says "No worries, we'll get through, no matter what!" How to find such a guy and how to motivate him are different questions, maybe for another blog post.

He Is Disciplined.

Design patterns, quality of code, static analysis, unit testing, high performance, reliability, security and even maintainability are all very important things to worry about. However, a good architect knows that all these can be resolved and achieved by programmers, if they are properly hired, motivated, organized and controlled. How to hire, motivate, organize and control them---that's what a good architect worries about.

He knows that process comes first, people next.

However, this is not what most software experts think. For example, according to Alistair Cockburn's article Agile Software Development: The People Factor published in IEEE Computer in 2001: "If the people on the project are good enough, they can use almost any process and accomplish their assignment. If they are not good enough, no process will repair their inadequacy---'people trump process' is one way to say this." It is acceptable if a programmer thinks like that, but not an architect.

An architect puts discipline on top of everything else, constantly inventing new rules and enforcing them. Moreover, he is not only making others obey the rules, but also following them himself.

Each project has its own set of rules; the ones we have on our projects at Zerocracy are just one example. A good architect thinks about the rules first and about the architecture second.

I totally agree with Len Bass that "the architecture should be the product of a single architect," as he said in his book Software Architecture in Practice. The question, however, is how exactly the architect will create the product: either in solo mode, making all technical decisions alone, or letting the team contribute in an organized and disciplined manner. The former is easy but less effective, the latter is way more difficult, but leads to much stronger solutions and better team synergy (I hate this word, but here it fits well).

He Is Strong.

Matthew McBride said in his article The Software Architect, published in CACM in 2007, that "Without strong supervision from the software architect, projects and attempted solutions tend to fall apart due to the weight of unmitigated complexity." The word strong is what is important to emphasize here.

What does strength mean in this context? An ability to stay in the office two days straight with just pizza and cola? An ability to multiply six-digit numbers in his head? An ability to memorize the purpose and design of all classes and methods? An ability to sit in a meeting with investors for three hours without checking Facebook even once? Not likely.

The strength of an architect is in the ability to say "No" when it's difficult to do so. For example:

  • "No, I will not merge your pull request";
  • "No, we will not implement this feature";
  • "No, you do not deserve a promotion yet";
  • "No, your code is not as good as we expect";
  • "No, this build is not stable enough to be released";
  • "No, you will not go on vacation this month."

There are many other instances of "No" which can easily turn an architect into a hated figure, but this is what his job is: to be the bad guy. This is why he has to be strong---to handle it all calmly and continue leading the project forward, toward his own well-defined technical goals.

He Is Abstract.

Abstract thinking is a very important positive trait of an architect. Programmers may lack that, since they are mostly focused on their own isolated tasks. An architect must think globally and see the product as a whole. Details are less important. He must rely on his people when talking about details.

He Is Social.

Software is a product of people. No matter how great the architect is, if he can't find the right people to implement his ideas and to bring back new ideas, he is doomed to fail. The key quality of the architect is the ability to work with people: recruit, motivate, and control their results. Social skills are what an architect needs in order to be successful in that, especially in finding new programmers and engaging them on the project. What exactly does this mean? Well, here are some examples:

  • High visibility in social networks;
  • A long list of previous projects and teams;
  • Active membership in professional groups;
  • Publicity in the blogosphere.

In other words, a good architect is the one with a big group of followers and supporters around him. I mentioned that in my recent talk How Much Do You Cost? at JEEConf 2017.

He Is Brave.

A good architect says many times a day: "It is my fault." If an architect doesn't have a habit of saying that frequently, he is not a good architect. He is just a programmer who is afraid of responsibility and authority.

The golden rule of a good manager is: "Success is yours, faults are mine." This is the attitude a good architect has to express to his team. When they win, he will always find a way to celebrate and reward them. When they fail, he will take full responsibility for the failure. Because it's his team, he found them, he motivated them, he controlled them, and he didn't punish them properly. That's why they failed. First of all, it's his fault.

What he will do with this fault is a separate question. Maybe he will train and coach someone, maybe he'll enforce some rules more aggressively, maybe he will even give someone his card. It's up to the architect. But for the outside world he will always be the guilty one, and the team must know that. If they know that, they will do everything to not let the architect down.

He Is Simple.

"Simplicity is a great virtue," said Edsger Dijkstra in 1984. For a programmer it's a virtue, for an architect it's a survival skill. An architect who can't explain his ideas in simple words, easily understood by other programmers, is not an architect. No matter how smart he is, no matter how bright his ideas are. If they can't be delivered in a simple form, they are worth nothing.

"If I don't understand you, it's your fault," said Yegor Bugayenko in 2015. A good architect remembers that.

He Is Coding.

Anthony Langsworth in his piece Should Software Architects Write Code? argues in favor of code-writing architects and in particular says that "Understanding code means the architect can use his or her judgment more effectively rather than rely on which developer is more persuasive." Indeed, an architect who is only capable of talking and drawing is a weak architect who will sooner or later let the team and the project down.

How much code the architect has to write depends on the age of the project. When the project is young and still in the phase of prototyping, the architect produces the majority of the code. Then, later, when the product matures, the architect steps away and mostly reviews the contributions of programmers. Eventually, when the project migrates into the maintenance phase, the architect may quit the project and transfer his responsibilities to one of the programmers.

He Is Ambitious.

An architect does want to get something in addition to money. He wants to be the smartest guy in the room, he wants to solve complex tasks nobody else has been able to solve before, he wants to save the world. He wants all of that to be appreciated and rewarded. He wants to be number one. In most cases he fails miserably. But he always gets back on his feet and tries again. Look for the guy with ambitions if you want to hire an architect, not just yet another programmer.

Michael Keeling, in his recent book Design It!: From Programmer to Software Architect (worth reading), says: "On some teams, architect is an official team role. On other teams, there is no explicit role and teammates share the architect's responsibilities. Some teams say they don't have an architect, but if you look closely, someone is fulfilling the architect's duties without realizing it. If your team doesn't have an architect, congratulations, you've got the job!"

Michael's point is that the architect's position is rarely given to someone voluntarily. Instead, an architect has to fight for it and demand it. Sometimes even going straight ahead and saying "I want to be the architect!"

What is important is that it will not sound like "I want to architect this." That would be the voice of a programmer, not an architect. An architect wants to be a man of power, not just a smart technical engineer. So, it's way more about a title for him, rather than just his actual responsibilities.

He Is Expensive.

Yes, the money question again. A good architect is expensive. If he is not, he is not a good architect.


Simplified GitHub Login for a Ruby Web App


  • Moscow, Russia

You know what OAuth login is, right? It's when your users click "login" and get redirected to Facebook, Twitter, Google, or some other website which then identifies them. Then they go back to your website and you know who they are. It's very convenient for them. It's convenient for you too, since you don't need to implement the login functionality and don't need to keep their credentials in a database. I created a simple Ruby gem to simplify this operation for GitHub only. Here is how it works.

The Savages (2007) by Tamara Jenkins

First, you will have to register your application in GitHub, as this page explains. This is how it works with Sinatra, but you can do something similar in any framework.

Then, somewhere in the global space, before the app starts:

require 'glogin'
configure do
  set :glogin, GLogin::Auth.new(
    # You get this from GitHub, when you register your
    # web application:
    client_id,
    # Make sure this value is coming from a secure
    # place and is NOT visible in the source code:
    client_secret,
    # This is what you will register in GitHub as an
    # authorization callback URL:
    'http://www.example.com/github-callback'
  )
end

Next, for all web pages we need to parse a cookie, if it exists, and convert it into a user:

require 'sinatra/cookies'
before '/*' do
  if cookies[:glogin]
    begin
      @user = GLogin::Cookie::Closed.new(
        cookies[:glogin],
        # This must be some long text to be used to
        # encrypt the value in the cookie:
        encryption_secret
      ).to_user
    rescue OpenSSL::Cipher::CipherError => _
      # Nothing happens here, the user is not logged in.
      cookies.delete(:glogin)
    end
  end
end

If the glogin cookie comes in and contains valid data, an instance variable @user will be set to something like this:

{ login: 'yegor256', avatar: 'http://...' }

Next, we need a URL for GitHub OAuth callback:

get '/github-callback' do
  cookies[:glogin] = GLogin::Cookie::Open.new(
    settings.glogin.user(params[:code]),
    # The same encryption secret that we were using above:
    encryption_secret
  ).to_s
  redirect to('/')
end

Finally, we need a logout URL:

get '/logout' do
  cookies.delete(:glogin)
  redirect to('/')
end

One more thing is the login URL you will need for your front page. Here it is:

settings.glogin.login_uri

For unit testing you can just provide an empty string as a secret for GLogin::Cookie::Open and GLogin::Cookie::Closed and the encryption will be disabled: whatever comes from the cookie will be trusted. For testing it will be convenient to provide a user name in a query string, like this:

http://localhost:9292/?glogin=tester

To enable that, it's recommended you add this line (see how it works in zold-io/wts.zold.io):

require 'sinatra/cookies'
before '/*' do
  cookies[:glogin] = params[:glogin] if params[:glogin]
  if cookies[:glogin]
    # same as above
  end
end

I use this gem in sixnines, 0pdd, and Zold web apps on top of Sinatra (all open source).


Object Validation: to Defer or Not?


  • Moscow, Russia

I said earlier that constructors must be code-free and do nothing aside from attribute initialization. Since then, the most frequently asked question is: What about validation of arguments? If they are "broken," what is the point of creating an object in an "invalid" state? Such an object will fail later, at an unexpected moment. Isn't it better to throw an exception at the very moment of instantiation? To fail fast, so to speak? Here is what I think.

Punching the Clown (2009) by Gregori Viens

Let's start with this Ruby code:

class Users
  def initialize(file)
    @file = file
  end
  def names
    File.readlines(@file).reject(&:empty?)
  end
end

We can use it to read a list of users from a file:

Users.new('all-users.txt').names

There are a number of ways to abuse this class:

  • Pass nil to the ctor instead of a file name;

  • Pass something else, which is not String;

  • Pass a file that doesn't exist;

  • Pass a directory instead of a file.

Do you see the difference between these four mistakes we can make? Let's see how our class can protect itself from each of them:

class Users
  def initialize(file)
    raise "File name can't be nil" if file.nil?
    raise 'Name must be a String' unless file.is_a?(String)
    @file = file
  end
  def names
    raise "#{@file} is absent" unless File.exist?(@file)
    raise "#{@file} is not a file" unless File.file?(@file)
    File.readlines(@file).reject(&:empty?)
  end
end

The first two potential mistakes were filtered out in the constructor, while the other two---later, in the method. Why did I do it this way? Why not put all of them into the constructor?

Because the first two compromise the object's state, while the other two compromise its runtime behavior. You remember that an object is a representative of a set of other objects it encapsulates, called attributes. The object of class Users can't represent nil or a number. It can only represent a file with a name of type String. On the other hand, what that file contains and whether it really is a file doesn't make the state invalid. It only causes trouble for the behavior.

Even though the difference may look subtle, it's obvious. There are two phases of interaction with the encapsulated object: connecting and talking.

First, we encapsulate the file and want to be sure that it really is a file. We are not yet talking to it, we don't want it to work for us yet, we just want to make sure it really is an object that we will be able to talk to in the near future. If it's nil or a float, we will have problems in the future, for sure. That's why we raise an exception from the constructor.

Then the second phase is talking, where we delegate control to the object and expect it to behave correctly. At this phase we may have other validation procedures, in order to make sure our interaction will go smoothly. It's important to mention that these validations are very situational. We may call names() multiple times and every time have a different situation with the file on disc. To begin with it may not exist, while in a few seconds it will be ready and available for reading.

Ideally, a programming language should provide instruments for the first type of validations, for example with strict typing. In Java, for example, we would not need to check the type of file; the compiler would catch that error earlier. In Kotlin we would be able to get rid of the NULL check, thanks to their Null Safety feature. Ruby is less powerful than those languages; that's why we have to validate "manually."

Thus, to summarize, validating in constructors is not a bad idea, provided the validations are not touching the objects but only confirm that they are good enough to work with later.
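To see the two phases side by side, here is a minimal script reusing the Users class from above (the missing file name below is made up for the demonstration):

```ruby
class Users
  def initialize(file)
    raise "File name can't be nil" if file.nil?
    raise 'Name must be a String' unless file.is_a?(String)
    @file = file
  end
  def names
    raise "#{@file} is absent" unless File.exist?(@file)
    raise "#{@file} is not a file" unless File.file?(@file)
    File.readlines(@file).reject(&:empty?)
  end
end

# State validation fails immediately, at construction time:
begin
  Users.new(nil)
rescue => e
  puts e.message # "File name can't be nil"
end

# Behavior validation fails only when we talk to the object;
# constructing it with a non-existent file is perfectly fine:
users = Users.new('no-such-file.txt')
begin
  users.names
rescue => e
  puts e.message # "no-such-file.txt is absent"
end
```

Note that the second object is created successfully: its state (a String file name) is valid, and only the later conversation with it reveals the situational problem.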


One More Recipe Against NULL


  • Moscow, Russia

You know what NULL is, right? It's evil. In OOP, your method can return NULL, it can accept NULL as an argument, your object can encapsulate it as an attribute, or you can assign it to a variable. All four scenarios are bad for the maintainability of your code---there are no doubts about that. The question is what to do instead. Let's discuss the "return it" part and I will suggest one more "best practice" on top of what was discussed a few years ago.

Snatch (2000) by Guy Ritchie

Look at this code:

Integer max(List<Integer> items) {
  // Calculate the maximum of all
  // items and return it.
}

What should this method do if the list is empty? Java's Collections.max() throws an exception. Ruby's Enumerable.max() returns nil. PHP's max() returns FALSE. Python's max() raises an exception. C#'s Enumerable.Max() also throws an exception. JavaScript's Math.max() returns NaN.

Which is the right way, huh? An exception, NULL, false or NaN?

An exception, if you ask me.

But there is yet another approach, which is better than an exception. This one:

Integer max(List<Integer> items, Integer def) {
  // Calculate the maximum of all
  // items and return it. Returns 'def' if the
  // list is empty.
}

The "default" object will be returned if the list is empty. This feature is implemented in Python's max() function: it's possible to pass both a list and a default element to return in case the list is empty. If the default element is not provided, the exception will be raised.
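The same idea can be sketched in Ruby in a few lines. The method name max_or is my own invention, not part of Enumerable:

```ruby
# Return the maximum of items, or the given default when the
# list is empty (mirrors Python's max(iterable, default=...)).
def max_or(items, default)
  items.empty? ? default : items.max
end

max_or([3, 7, 2], 0) # => 7
max_or([], -1)       # => -1
```

The caller decides what "empty" means for its use case, and neither NULL nor an exception ever escapes the method.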


An Open Code Base Is Not Yet an Open Source Project


  • Dnipro, Ukraine

A few weeks ago someone suggested I should try to integrate IntelliJ IDEA's static analysis rules into Qulice, our aggregator of Checkstyle, PMD, FindBugs, and some other analyzers. I do love IDEA's rules---some of them are unique and very useful. I asked whether I could find them somewhere in Maven Central (they are written in Java) and the answer was "You'll have to figure out yourself how to use them, but they are open source." Here comes my opinion about this situation: I believe that open source doesn't just mean the code is readable without authorization. It means something much bigger.

Her (2013) by Spike Jonze

Just making a piece of code publicly accessible is not what it takes to call it open source software. Actually, it only harms the product, and the reputation of its author, if it's open but not ready for reuse (which is what the open source world is all about). As Eric Raymond said in his famous piece The Cathedral and the Bazaar, "Good programmers know what to write. Great ones know what to rewrite (and reuse)."

It's the responsibility of the software product's author to help those "good" programmers to reuse the code. Coding, testing, debugging, and making sure "it works on my laptop" is one thing. Making it readable and reusable is a totally different piece of work, which may take much more time.

As Karl Fogel said in Producing Open Source Software: "Most free software projects fail." They fail (on top of many other factors) because not enough attention is paid to the following basic things (in no particular order):

README. I'm sure you host your product on GitHub. (If not, what's wrong with you?) There must be a README.md file in the root directory that explains what the product is all about and how we should use it. A few good examples: leejarvis/slop, mongobee/mongobee, ronmamo/reflections, and yegor256/takes (this one is mine). A few bad examples: qos-ch/slf4j, rzwitserloot/lombok, and junit-team/junit4 (don't be like these guys).

No matter how rich you've made your website, Javadoc, Wiki, mailing list, and Twitter, the README is the place where we expect to see everything. Only if and when we get interested will we investigate further and deeper. Read the README files in other projects and copy their best ideas. README is your showcase, it must shine.

License. Most of us don't pay attention to this bureaucracy. I didn't either, until recently. I thought that the moment my code is open I can forget about any rights and royalties. They will just use my code and I won't see any profit, ever. The license I attach to it won't matter---nobody reads it anyway. This is exactly what happens in most cases. But only while those users are small potatoes.

A few years ago I was an architect on a software project and we had to create an analyzer of hardware components, like CPU, memory, hard disc, etc. We had to make sure all of them worked as expected after running pretty complex and customized tests. My obvious suggestion was to use open source tools, which would do the hard work for us. We would only have to integrate them. It was an awesome idea, until some of us decided to check the licenses of those tools.

That was the moment I realized that I was so wrong for not paying attention to what licenses say. GPL, for example, which we found in a few tools, didn't allow us to reuse the code if our product wasn't open source too. Since we were creating proprietary software, we understood that we weren't able to use copyleft modules, only MIT, BSD or similar.

I'm suggesting you think about the license before publishing the product. I've used MIT in all my products since 2016.

Distribution. A mere collection of .rb files is not reusable Ruby code. Well, maybe for those hackers I despise so much, it is. But for professional developers, who are too lazy to read their own code, let alone someone else's, it definitely isn't.

"Take it from GitHub" is not a polite way to treat us---your fellow programmers---anymore. It was, twenty years ago, but now we have repositories. You have to distribute your product as an "artifact" through one of those public repositories, and make it possible for us to fetch it from there, skipping the testing and packaging, and just using it as a product (a Ruby gem or Java JAR, for example).

I'm talking about repositories like Maven Central, npmjs, or RubyGems. You have to find a way to deploy your product there. It's not an easy task, even though those repositories do their best to simplify the process. We use Rultor in all our projects, which helps us streamline the deployment.

Package managers like Maven, NPM, Rake, Grunt, Gradle and others, are the standard and traditional way of reusing open source software (proprietary too). If your product is not available in a public repository, it's not a product; it's just a code base.

Javadoc. We all hate writing documentation. And we hate libraries that are not documented. I usually find it boring to write Javadoc blocks for my classes, but I understand that without them the code I'm writing inside those classes will not interest anyone.

The best format for those Javadoc blocks is "by example." Instead of prose I'd recommend you demonstrate how to use the class, especially in combination with its neighbors. Moreover, I'd suggest you don't write documentation anywhere else apart from those Javadoc blocks. (They exist in other languages too, but have different names.)

The problem with Javadoc is that it's not so easy to format the text so that it looks visually attractive. Maybe that's why many programmers still rely on Wikis or project websites. I'd recommend you stay inside Javadoc blocks and learn their formatting syntax.

Badges. As you can see, I like badges. First and foremost they make a repository look as if it's being "actively maintained," especially if those badges are green. They don't really deliver any valuable information. They mostly say: "Our author has very good taste, see how perfectly our colors match!"

Jokes aside, it's not so easy to add all those badges. Each badge will take you some time, to integrate a third party system, to make sure the numbers are good enough to be proud of, and to keep it under control. If the repository is not being watched over, the badges will eventually start failing.

Continuous Integration. In order to use your code we have to trust it, meaning that we have to be sure that it works, or at least passes automated tests. (Do I have to say that you must have tests?) How can we be sure it works? CI is the answer. We must be able to see the logs of the recent CI build and make sure it is clean.

It's a matter of trust. You may never use those Travis builds and simply ignore their red and green signals, but they are important for us---your clients. I add Travis badges to all projects of mine, right after I create a new repository.

Contribution Guidelines. For a regular GitHub addict it's not a problem to figure out how to send you a pull request. However, the majority of us, at least initially, will consist of active users, not contributors. We will try to use your product and will attempt to customize it for our needs. If we get lost, we will leave, frustrated.

To prevent this, you have to explain what a disciplined contributor has to do in order to make changes to your code base. Here are the questions I'd recommend you answer in your CONTRIBUTING.md:

  • How do I run an automated build?
  • How big/small does a pull request have to be in order to be accepted?
  • What are your style guidelines?
  • How do bugs have to be reported, tagged, explained?
  • What makes a good bug report?

Here is the text I use in all my projects: ISSUE_TEMPLATE.md and PULL_REQUEST_TEMPLATE.md.

Quality Wall. Finally, if you are lucky, we will use your product and will be interested in contributing. You will start getting our pull requests. The question is how fast we will ruin your code base. We will, if you don't protect yourself.

If you strictly review each pull request and reject anything that doesn't look like "great" code, you will lose us, your contributors. We don't want to write great code, we want to make changes to your product so that it becomes more suitable for our needs. The greatness of the code is your concern, not ours.

On the other hand, if you accept whatever comes in, the architecture will lose its robustness (if it ever had any) and you again will lose us, your contributors. This time you will lose us because the product will become bad and difficult to maintain and contribute to.

The best way to keep the balance is to "hire" a tool to help you: build automation, static analysis, automated tests, and coverage control. You have to configure the product to fail when the changes someone introduces violate its internal quality expectations. I use Rultor for that too.

Did I forget anything?


The Right Way to Report a Bug


  • Moscow, Russia

You know, at Zerocracy, either you are a programmer or a tester, and we pay for each bug you find and report. Well, not quite. We pay for each bug report a project architect considers good enough to pay for. The architect's decision is totally subjective and non-disputable, according to §29 of the Policy. Some of our developers find this unfair and ask me to explain how they can report bugs such that they are definitely paid. Here is a non-exhaustive list of my recommendations.

Burn After Reading (2008) by Coen Brothers

To be honest, there are many articles written before on this very subject. I will try not to repeat them. They mostly say reasonable things, like "be specific," "choose a strong title," "avoid duplicates," and many others. My recommendations here are more of a psychological nature.

Stay cool. Don't expect all of your bugs to be accepted and paid for. Some of them won't be. This must not stop you from reporting them.

Exaggerate. No matter how minor the bug is, present it as if the entire world will collapse tomorrow if they don't fix it. Of course, they will make their own decision about the priority and severity of the bug, but don't help them to make it against you.

Victimize yourself. Don't just say "the class is broken"---there is no victim in this statement. So, no need to save anyone's life. The bug is minor---no need to pay. Instead, say "I can't use the class." Present yourself as a victim. Or even better, represent a group of victims: "Nobody can really use this class."

Push them. If a bug report is not paid for, don't hesitate to ask why. Insist that it was a very important problem and you deserve to be paid. If they still don't pay, forget it and move on. You must not look like it offends you somehow.

Show efforts. The bug description must look "rich," clearly demonstrating that you invested a lot of effort in its creation. If there is just a single line, it's easier for them to not pay you---they won't feel any guilt. However, if it's long, detailed, properly formatted, and contains multiple supporting links, they will feel bad if they don't pay.

Look engaged. Say something like "I'm ready to investigate more and provide additional details, if you need me to." Of course you won't do that (in most cases), but you have to say it. This will make it look like you care and this bug comes right from your heart. How can they not pay for it?

Look altruistic. Don't show them that you are reporting these bugs just to get money. They know that anyway, but still. Look like you care about the project and honestly want to help. Say that you worry about the users, about the market, about the mission, about the bigger scope, etc.

Aggregate. This may sound against the principles of bug tracking I suggested earlier, but when your bugs are small and cosmetic---aggregate them. In such a case you have a chance to win. They will reject three minor bugs, but they won't reject a bigger one with three minor parts.

I believe that if you follow these simple recommendations, you will be a more successful bug reporter. At least at Zerocracy.


How to Be Lazy and Stay Calm


  • Moscow, Russia

What frustrates me most in my profession of software development is the regular necessity to understand large problem scopes before fixing small bugs, especially if the code is legacy and not mine. Actually, it's even more frustrating when the code is mine. The "deep thinking," as they call it, which is always required before even a small issue can be resolved, seriously turns me away from programming. Or did turn me away. Until I started to think differently and encourage myself to be lazy. Here is how.

Sin City (2005) by Frank Miller

I wrote about this a few years ago in this blog post: How to Cut Corners and Stay Cool. However, in our Telegram group, where we talk about Zerocracy, some programmers keep asking me the same question over and over again: What should I do when the project is absolutely new to me, I have just 30 minutes, and the bug is very complex?

One of the core principles of Zerocracy is #NoAltruism. This literally means that you should always and only think about yourself and your personal profit. You should not try to improve the project, to increase its quality, to fix the code, or to refactor anything... unless you are paid for it.

First of all, when the task, which you are going to be paid for, is in front of you and you can't understand how to solve it, don't blame yourself. You are not supposed to be an expert in the legacy code you just opened up. Strictly speaking, you are not supposed to be an expert in anything. A project, unlike your mom, doesn't expect you to be intelligent or tech-savvy. It needs you to close tickets.

Who do you blame, if not yourself, when the bug is serious, the code is messy, and you have no idea how much time it will take just to understand it, let alone fix it? Well, you can blame everybody around you, but first of all you should blame the code base itself. How do you blame it? You report its low quality by creating new tickets, which may sound like this:

  • "The class X is not sufficiently documented, I don't understand how it works."
  • "The method X is too complex, I don't know what it does."
  • "The algorithm X is messy, I can't figure out what it does."
  • "The library X is used here, but I don't understand why you don't use library Y."
  • "The rules of class naming are not clear, document them please."
  • "The principle of data organization is not obvious, document it."

However, don't make the mistake many programmers make when we tell them that tickets are the only right way to solve problems. They start asking questions and seeking help in the tickets, like this:

  • "How can I unit test class X, please explain."
  • "Please help me create class X."
  • "Where should I put class X, in which package?"
  • "Which library should I use for doing X?"

The project is not a school, it's not interested in making you smarter or more of an expert in its code. Nobody will explain anything to you, because it's a waste of money and time. What the project will do instead is fix its code base so that it becomes cleaner and more obvious for you and everybody else. Thus, never ask for explanation or help, ask for documentation and source code fixes.

What do you do next? You sit and wait, until those tickets are resolved. Who will resolve them? You don't care. That's a problem for the project manager. Maybe he/she will even assign those tickets back to you and it will be your problem to resolve them. But if that happens, the scope of work will be different for you. You won't need to fix the bug anymore, you will have to document some functionality or refactor some module.

You will have other problems in this new and smaller scope. You will create new tickets, blaming everybody around you, and they also may come back to you. And so on and so forth. Ultimately, the scope of a ticket will be small enough to fix in 30 minutes.

See the algorithm? I'm sure you do, but it's very difficult to apply it to real life and real software projects, for a few obvious psychological reasons:

  • You are ashamed. You are trained to feel guilty when you are not smart enough. What can I say? Just stop it!

  • You are a perfectionist. You want to complete the entire ticket, solve the entire problem, and understand the entire scope. What can I say? This won't be solved while the project continues to pay you by the hour/month. Once they start paying for results, this disease will be cured.

  • You have no passion. You just don't care about the quality of code at all. You don't want it to look clean, you can't even tell what clean is or what messy is. You just want them to pay you by the end of the month. In this case you won't even know what tickets to report. What can I say? I guess you have to try and find another job. Maybe a manager?

  • You are afraid. Blaming the project and reporting tickets may look like you have a negative attitude towards the code base, and people who created it, which is not true. Instead, your attitude is positive, since you care about it and want it to get better. What can I say? Make your tickets sound extremely polite and gentle. But keep reporting them.

  • You have no time. You have to solve the problem now and you have no time to wait for the resolution of those complaints you reported. What can I say? Blame the management and require more time. Much more time. But never blame yourself.

Software development is perfect territory for cutting corners, being lazy and remaining calm, because our work is often discrete and can be very incremental. Very occasionally it might not be possible to blame the project and put the ticket on pause until all your complaints are addressed. I can't imagine such a situation though. If you can, please let me know.


Nine Steps of Learning by Refactoring


  • Moscow, Russia

I was asked on Twitter recently how it is possible to refactor code when one doesn't understand how it works. I replied that it is "learning by refactoring." Then I tried to Google it and found nothing. I was surprised. To me refactoring seems to be the most effective and obvious way to study source code. Here is how I usually do it, in nine object-oriented steps.

Dom Hemingway (2013) by Richard Shepard

According to Wikipedia, code refactoring is "the process of restructuring existing computer code---changing the factoring---without changing its external behavior." The goal of refactoring is to make code more readable and suitable for modifications.


Martin Fowler, in his famous book Refactoring: Improving the Design of Existing Code, suggested a number of refactoring techniques which help make code simpler, more abstract, more readable, etc. Some of them are rather questionable from an object-oriented standpoint---like Encapsulate Field, for example---but most of them are valid.

Here is what I usually do when I don't know the code but need to modify it. The techniques are sorted in order of complexity, starting with the easiest one.

Remove IDE Red Spots

When I open the source code of Cactoos in IntelliJ IDEA, using my custom settings.jar, I see something like this:

The figure

When I open the source code of, say, Spring Boot, I see something like this (it's o.s.b.ImageBanner randomly picked out of a thousand other classes that look very similar):

The figure

See the difference?

The first thing I do, when I see someone else's code, is to make it "red spots free" for my IDE. Most of those red spots are easy to remove, while others will take some time to refactor. While doing that I learn a lot about the crap program I have to deal with.

Remove Empty Lines

I wrote some time ago that empty lines inside method bodies are bad things. They are obvious indicators of redundant complexity. Programmers tend to add them to their methods in order to simplify things.

This is a method from the Apache Maven code base (class RepositoryUtils picked at random, but almost all other classes are formatted the same way):

The figure

Aside from being "all red," their code is full of empty lines. Removing them makes the code more readable and also helps me understand how it works. Bigger methods will need refactoring, since without empty lines they become almost completely unreadable. Hence, I compress them, understand them, and make them smaller, mostly by breaking them down into smaller methods.
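Here is a minimal sketch of what I mean (the class and method names are mine, invented for illustration): a method whose empty lines used to separate two logical "paragraphs" is compressed by extracting each paragraph into a small method of its own.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: render() used to be one long method whose
// empty lines separated joining from capitalizing; each "paragraph"
// is now a small private method, and the empty lines are gone.
class Report {
  String render(List<String> lines) {
    return this.capitalized(this.joined(lines));
  }
  private String joined(List<String> lines) {
    StringBuilder text = new StringBuilder();
    for (String line : lines) {
      text.append(line.trim()).append('\n');
    }
    return text.toString();
  }
  private String capitalized(String text) {
    return text.toUpperCase();
  }
}
```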

Make Names Shorter

I'm generally in favor of short one-noun names for variables and one-verb names for methods. I believe that longer "compound" names are an indicator of unnecessary code complexity.

For example, I found this method registerServletContainerInitializerToDriveServletContextInitializers (69 characters!) in the o.s.b.w.e.u.UndertowServletWebServerFactory class in Spring Boot. I wonder why the author skipped the couldYouPlease prefix and the otherwiseThrowAnException suffix.

Jokes aside, such long method names clearly demonstrate that the code is too complex and can't be explained with a simple register or even registerContainer. It seems that there are many different containers, initializers, servlets, and other creatures that need to be registered somehow. When I join a project and see a method with this name I'm getting ready for big trouble.

Making names shorter is the mandatory refactoring step I take when starting to work with foreign or legacy code.

Add Unit Tests

Most classes (and methods) come without any documentation, especially if we are talking about closed-source commercial code. We are lucky if the classes have more or less descriptive names and are small and cohesive.


However, instead of documentation I prefer to deal with unit tests. They explain the code much better and prove that it works. When I don't understand how the class works, I try to write a unit test for it. In most cases it's not possible, for many reasons. In such a case I try to apply everything I learned from Working Effectively With Legacy Code by Michael Feathers and Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. Both books are pretty much focused on this very problem: what to do when you don't know what to do, testing-wise.
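A sketch of what such a test often looks like in practice (a "characterization" test, in the spirit of Feathers' book; the Legacy class here is hypothetical): before refactoring code we don't fully understand, we pin down what it actually does today.

```java
// Hypothetical legacy method with no documentation:
class Legacy {
  String normalize(String text) {
    return text == null ? "" : text.trim().toLowerCase();
  }
}

// Characterization tests: they record current behavior, so any
// refactoring must keep them passing.
class LegacyTest {
  boolean normalizesNullToEmpty() {
    return new Legacy().normalize(null).equals("");
  }
  boolean trimsAndLowers() {
    return new Legacy().normalize("  Hello ").equals("hello");
  }
}
```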

Remove Multiple Returns

I wrote earlier that the presence of multiple return statements in a single method is not something object-oriented programming should encourage. Instead, a method must always have a single exit point, just like those functions in functional programming.

Look at this method from the o.s.b.c.p.b.Binder class from Spring Boot (there are many similar examples there, I picked this one randomly):

The figure

There are five return statements in such a small method. For object-oriented code that's too much. It's OK for procedural code, which I also write sometimes. For example, this Groovy script of ours has five return keywords too:

The figure

But this is Groovy and it's not a class. It's just a procedure, a script.

Refactoring and removing multiple return statements definitely helps make code cleaner. Mostly because without them it's necessary to use deeper nesting of if/then/else statements and then the code starts to look ugly, unless you break it down into smaller pieces.
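A small sketch of this refactoring, with names of my own invention: the same logic written first with three return statements, then with a single exit point where the branches only assign the result.

```java
// Hypothetical example of removing multiple return statements.
class Discount {
  // Before: three exit points.
  int percentBefore(int amount) {
    if (amount > 100) {
      return 20;
    }
    if (amount > 50) {
      return 10;
    }
    return 0;
  }
  // After: one exit point; the branches only assign the result.
  int percentAfter(int amount) {
    final int percent;
    if (amount > 100) {
      percent = 20;
    } else if (amount > 50) {
      percent = 10;
    } else {
      percent = 0;
    }
    return percent;
  }
}
```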

Get Rid of NULLs

NULLs are evil, it's a well-known fact. However, they are still everywhere. For example, there are 4,100 Java files in Spring Boot v2.0.0.RELEASE and 243K LoC, which include the null keyword 7,055 times. This means approximately one null for every 35 lines.

By contrast, Takes Framework, which I founded a few years ago, has 771 Java files, 154K LoC, and 58 null keywords, which is roughly one null per 2,700 lines. See the difference?

The code gets cleaner when you remove NULLs, but it's not so easy to do. Sometimes it's even impossible. That's why we still have those 58 cases of null in Takes. We simply can't remove them, because they are coming from the JDK.
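One common way to remove a null (a hypothetical sketch, with invented names): instead of returning null for a missing entry, the method fails fast with a descriptive exception, so callers never need a null check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: name() never returns null; a missing user
// is an error, reported immediately with a descriptive message.
class Users {
  private final Map<Integer, String> map = new HashMap<>();
  Users with(int id, String name) {
    this.map.put(id, name);
    return this;
  }
  String name(int id) {
    final String name = this.map.get(id);
    if (name == null) {
      throw new IllegalArgumentException(
        String.format("User #%d not found", id)
      );
    }
    return name;
  }
}
```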

Make Objects Immutable

As I demonstrated some time ago, immutability helps keep objects smaller. Most classes that I see in the foreign code I deal with are mutable. And large.

If you look at any artifact analyzed by jpeek, you will see that in most of them approximately 80% of classes are mutable. Moving from mutability to immutability is a big challenge in object-oriented programming, which, if resolved, leads to better code.

This refactoring step of making things immutable is purely profitable.
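A minimal sketch of the move (the Point class is my own illustration): state becomes final and is set once in the constructor; "modification" returns a new object instead of changing state in place.

```java
// Hypothetical immutable class: no setters, no state changes.
final class Point {
  private final int x;
  private final int y;
  Point(int x, int y) {
    this.x = x;
    this.y = y;
  }
  // Instead of mutating, return a new object with the new state.
  Point movedX(int dx) {
    return new Point(this.x + dx, this.y);
  }
  int x() {
    return this.x;
  }
}
```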

Remove Static

Static methods and attributes are convenient if you are a procedural programmer. If your code is object-oriented, they must go away. In Spring Boot there are 7,482 static keywords, which means one for every 32 lines of code. By contrast, in Takes we have 310 static-s, which is one for every 496 lines.

Compare these numbers with the statistics about NULL and you will see that getting rid of static is a more complex task.
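A hypothetical before/after sketch of this step: the procedural static utility on top becomes the object below, which can be passed around, composed with other objects, and replaced in tests.

```java
// Before: a procedural static utility.
class TextUtils {
  static String trimmed(String text) {
    return text.trim();
  }
}

// After: an object that encapsulates the text it works on.
final class Trimmed {
  private final String origin;
  Trimmed(String origin) {
    this.origin = origin;
  }
  String asString() {
    return this.origin.trim();
  }
}
```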

Apply Static Analysis

This is the final step and the most complex one. It's complex because I configure static analyzers to their maximum potential or even more. I'm using Qulice, which is an aggregator of Checkstyle, PMD, and FindBugs. Those guys are strong by themselves, but Qulice makes them even stronger, adding a few dozen custom-made checks.

The principle I use for static analysis is 0/100. This means that either the entire code base is clean and there are no Qulice complaints, or it's dirty. There is nothing in the middle. This is not a very typical way of looking at static analysis. Most programmers are using those tools just to collect "opinions" about their code. I'm using them as guides for refactoring.
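If you build with Maven, wiring Qulice into the build looks something like this (version omitted; check the current release of qulice-maven-plugin). With the check goal bound to the build, any complaint fails it, which is exactly the 0/100 principle:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>com.qulice</groupId>
      <artifactId>qulice-maven-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```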

Check out this video, which demonstrates the number of complaints Qulice gives for the spring-boot-project/spring-boot sub-module in Spring Boot (the video has no end, since I lost my patience in waiting):

When Qulice says that everything is clean, I consider the code base fully ready for maintenance and modifications. At this point the refactoring is done.


Fully Transparent Donations via Zerocracy


  • Moscow, Russia

Open source is free, as in beer: you write code, nobody pays you. Of course, there are many ways to monetize your efforts, but there will be no direct cash payments from your users, usually. There are ways to collect money, which include an obvious "tip jar" button on your GitHub project page. The chances anyone will pay are low though. In my opinion, this is mostly because nobody trusts you enough---they are not sure you will use the money to make the product better. Most likely you will just spend it and nothing will change. But they want the product, not to make you happier. At least that's what I feel when I see a Patreon button.

Last Tango in Paris (1972) by Bernardo Bertolucci

Zerocracy is a platform that manages programmers remotely. Moreover, it's absolutely free for open source projects. Take a look at Cactoos or Takes---they are both managed by Zerocrat. These projects are funded by myself. I add money to Zerocracy out of my pocket and Zerocracy pays programmers when they close their microtasks in GitHub.


A few days ago someone approached me by email and literally said: "There is a bug in your project, I'm happy to pay you for your time if you can come up with a solution." He was ready to donate and wanted me (or us) to solve his specific issue. I could just take his money over PayPal and fix the issue, but I'm not really an active maintainer of the project he was interested in, and I'm busy at the moment.

I realized that the best way would be to take the money, break down the problem into pieces, and delegate them to a few programmers, just like it usually works in Zerocracy. In other words, I decided to suggest he fund the project and then let us use the funds for microtasking, keeping the focus on the issue he was interested in.

Moreover, this concept was earlier proposed by @skapral.

He gladly accepted the offer. We implemented the functionality in Zerocrat and he contributed $128 via Stripe.

Now anyone can give a few dollars to a project, if it's managed by Zerocracy. The contributor will see how those funds are being spent, down to each and every dollar! Try, for example, one of these buttons and you will see detailed financial reports of each project and will be able to add your funds:

Cactoos.org:
Donate via Zerocracy

Takes.org:
Donate via Zerocracy

The advantage of this approach, compared to, for example, BountySource, is that the money will be distributed in micro-payments and will be fully traceable. I believe this makes a difference for donors---they are interested in seeing how their money is being used.


How I Test My Java Classes for Thread-Safety


  • Moscow, Russia

I touched on this problem in one of my recent webinars, now it's time to explain it in writing. Thread-safety is an important quality of classes in languages/platforms like Java, where we frequently share objects between threads. The issues caused by lack of thread-safety are very difficult to debug, since they are sporadic and almost impossible to reproduce on purpose. How do you test your objects to make sure they are thread-safe? Here is how I'm doing it.

Scent of a Woman (1992) by Martin Brest

Let us say there is a simple in-memory bookshelf:

class Books {
  private final Map<Integer, String> map =
    new ConcurrentHashMap<>();
  int add(String title) {
    final Integer next = this.map.size() + 1;
    this.map.put(next, title);
    return next;
  }
  String title(int id) {
    return this.map.get(id);
  }
}

First, we put a book there and the bookshelf returns its ID. Then we can read the title of the book by its ID:

Books books = new Books();
String title = "Elegant Objects";
int id = books.add(title);
assert books.title(id).equals(title);

The class seems to be thread-safe, since we are using the thread-safe ConcurrentHashMap instead of a more primitive and non-thread-safe HashMap, right? Let's try to test it:

class BooksTest {
  @Test
  public void addsAndRetrieves() {
    Books books = new Books();
    String title = "Elegant Objects";
    int id = books.add(title);
    assert books.title(id).equals(title);
  }
}

The test passes, but it's just a one-thread test. Let's try to do the same manipulation from a few parallel threads (I'm using Hamcrest):

class BooksTest {
  @Test
  public void addsAndRetrieves() throws Exception {
    Books books = new Books();
    int threads = 10;
    ExecutorService service =
      Executors.newFixedThreadPool(threads);
    Collection<Future<Integer>> futures =
      new ArrayList<>(threads);
    for (int t = 0; t < threads; ++t) {
      final String title = String.format("Book #%d", t);
      futures.add(service.submit(() -> books.add(title)));
    }
    Set<Integer> ids = new HashSet<>();
    for (Future<Integer> f : futures) {
      ids.add(f.get());
    }
    assertThat(ids.size(), equalTo(threads));
  }
}

First, I create a pool of threads via Executors. Then I submit ten objects of type Callable via submit(). Each of them will add a new unique book to the bookshelf. All of them will be executed, in some unpredictable order, by some of those ten threads from the pool.

Then I fetch the results of their executions through the list of objects of type Future. Finally, I count the unique book IDs created. If the number is 10, there were no conflicts. I'm using the Set collection in order to make sure the list of IDs contains only unique elements.

The test passes on my laptop. However, it's not strong enough. The problem here is that it's not really testing the Books from multiple parallel threads. The time that passes between our calls to submit() is large enough to finish the execution of books.add(). That's why in reality only one thread will run at the same time. We can check that by modifying the code a bit:

AtomicBoolean running = new AtomicBoolean();
AtomicInteger overlaps = new AtomicInteger();
Collection<Future<Integer>> futures =
  new ArrayList<>(threads);
for (int t = 0; t < threads; ++t) {
  final String title = String.format("Book #%d", t);
  futures.add(
    service.submit(
      () -> {
        if (running.get()) {
          overlaps.incrementAndGet();
        }
        running.set(true);
        int id = books.add(title);
        running.set(false);
        return id;
      }
    )
  );
}
assertThat(overlaps.get(), greaterThan(0));

With this code I'm trying to see how often threads overlap each other and do something in parallel. This never happens and overlaps is equal to zero. Thus our test is not really testing anything yet. It just adds ten books to the bookshelf one by one. If I increase the amount of threads to 1000, they start to overlap sometimes. But we want them to overlap even when there's a small number of them. To solve that we need to use CountDownLatch:

CountDownLatch latch = new CountDownLatch(1);
AtomicBoolean running = new AtomicBoolean();
AtomicInteger overlaps = new AtomicInteger();
Collection<Future<Integer>> futures =
  new ArrayList<>(threads);
for (int t = 0; t < threads; ++t) {
  final String title = String.format("Book #%d", t);
  futures.add(
    service.submit(
      () -> {
        latch.await();
        if (running.get()) {
          overlaps.incrementAndGet();
        }
        running.set(true);
        int id = books.add(title);
        running.set(false);
        return id;
      }
    )
  );
}
latch.countDown();
Set<Integer> ids = new HashSet<>();
for (Future<Integer> f : futures) {
  ids.add(f.get());
}
assertThat(overlaps.get(), greaterThan(0));
assertThat(ids.size(), equalTo(threads));

Now each thread, before touching the books, waits for the permission given by latch. When we submit them all via submit() they stay on hold and wait. Then we release the latch with countDown() and they all start to go, simultaneously. Now, on my laptop, overlaps is equal to 3-5 even when threads is 10.

And that last assertThat() crashes now! I'm not getting 10 book IDs, as I was before. It's 7-9, but never 10. The class, apparently, is not thread-safe!

But before we fix the class, let's make our test simpler. Let's use RunsInThreads from Cactoos, which does exactly what we did above, but under the hood:

class BooksTest {
  @Test
  public void addsAndRetrieves() {
    Books books = new Books();
    MatcherAssert.assertThat(
      t -> {
        String title = String.format(
          "Book #%d", t.getAndIncrement()
        );
        int id = books.add(title);
        return books.title(id).equals(title);
      },
      new RunsInThreads<>(new AtomicInteger(), 10)
    );
  }
}

The first argument of assertThat() is an instance of Func (a functional interface), accepting an AtomicInteger (the first argument of RunsInThreads) and returning Boolean. This function will be executed on 10 parallel threads, using the same latch-based approach demonstrated above.

This RunsInThreads seems to be compact and convenient; I'm using it in a few projects already.

By the way, in order to make Books thread-safe we just need to add synchronized to its method add(). Or maybe you can suggest a better solution?
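One alternative, sketched under the assumption that IDs only need to be unique (not derived from the map's size): the race is in add(), where two threads may read the same map.size() and then overwrite each other's entry. An AtomicInteger hands out IDs atomically, without locking the whole method.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// A sketch of a lock-free fix: IDs come from an atomic counter,
// so two concurrent add() calls can never collide on the same key.
class Books {
  private final Map<Integer, String> map = new ConcurrentHashMap<>();
  private final AtomicInteger count = new AtomicInteger();
  int add(String title) {
    final int next = this.count.incrementAndGet();
    this.map.put(next, title);
    return next;
  }
  String title(int id) {
    return this.map.get(id);
  }
}
```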

P.S. I learned all this from Java Concurrency in Practice by Goetz et al.


Zerocracy: A Project Manager That Never Sleeps


  • Moscow, Russia

I've been in the software business for almost 30 years. I wrote my first piece of code when I was 12. Since then I have been programming and managing other programmers, hiring and firing them, planning projects and their budgets, finding and losing customers, investing into software teams and losing my investments, even teaching others how to manage software projects. What is my overall impression? It's a pain. I think I've found a solution though.

Casino (1995) by Martin Scorsese

Software projects fail. Most of them, anyway. Miserably. They run out of money, they produce software that doesn't work, they miss deadlines, they lose key people and fall apart, and so on. Why does it happen? Not because programmers are stupid, nor because technologies are immature, nor because hardware is unstable.

They fail because we don't manage them.

We miss important data, we lose track of events, we don't pay attention to risks and threats, we don't plan time and money, and we don't do many other things that PMBOK says we have to do in order to keep a project under control.

What do we do instead?

We rely on our innate hierarchical instincts and hope for the best.

Here is what I mean by that. First, we put a group of people together, also known as programmers. Then, we tell them how important the goal is that they are going to achieve---develop a software product. Then we make sure they know who the boss is, by paying them by the hour. Finally, thanks to their upbringing and education, they "do the right thing" and software gets created. Maybe. If we are lucky.

This is exactly how it has worked in the Animal Kingdom, for millions of years. In order to survive, mammals create hierarchies: the strongest males are on top, others do what they say. If they don't obey, violence helps. Scientists think that humans are also very hierarchical creatures---we feel discomfort if we don't know who the boss is, who to submit to.

Thanks to this hierarchical instinct, just like lions, wolves, and monkeys, we manage our projects using force. We, unlike the animals, don't use physical violence anymore, at least not in the software business, but we have a huge arsenal of more sophisticated punishment methods.

They do work.

However, due to the complexity of our profession, the quality and efficacy of the results we observe are very low. The CHAOS Report (2015) by the Standish Group says that "software development projects are in chaos, and we can no longer imitate the three monkeys---hear no failures, see no failures, speak no failures." The report also demonstrates that as a result of this chaos we have restarts (94% of projects!), cost overruns, and time overruns. It also says that technology incompetence is the root cause of project failure in only 7% of cases. In almost all other cases, management is the source of the trouble.


We, at Zerocracy, believe that in the 21st century programmers (and not only them) deserve a better and a more effective replacement for this "monkey management." We believe that the software development world needs a management model which is based on people's professional merits, instead of on their ability to play the alpha/beta games.


Zerocracy offers exactly that: Zerocrat, an automated project manager, which communicates as a chat bot and tells programmers what to do. It replaces the traditional "boss in the office," distributing micro-tasks among programmers, validating their results, paying them, and calculating schedule, budget, and scope predictions. It does all the routine jobs a professional project manager should do on any project, but rarely does, because they are so boring. They don't look boring to the robot, though.

Zerocrat is a project manager that never sleeps. It doesn't make mistakes, doesn't forget things, and doesn't accept excuses. It also doesn't know anything about hierarchies. It manages programmers only by their merits, which are visible via objective metrics. Thanks to this, programmers become their own bosses, reporting only to a soulless piece of software, which acts according to a very strict policy.


The policy of Zerocracy is based on XDSD principles, which were introduced in 2010 and have been practiced since then on many software projects. The principles have proved able to seriously increase project predictability, decrease costs, and enforce code quality. They also boost the motivation of the type of programmers who like the idea of being their own bosses.

How many of those people are out there on the market---the future will soon show us.

Speaking philosophically, while it's a common belief that in the future AI-powered robots will do what we want, we believe in the opposite: Robots will tell us what to do. Management is what computers do better, while writing code, drawing diagrams, growing flowers, or cooking a soup is what we, humans, do better and actually enjoy doing. We believe that in the future computers will help us organize ourselves by taking the routine part of the management on themselves.

Zerocracy makes the first step in this direction.


Fluent Interfaces Are Bad for Maintainability


  • Moscow, Russia

Fluent interface, first coined as a term by Martin Fowler, is a very convenient way of communicating with objects in OOP. It makes their facades easier to use and understand. However, it ruins their internal design, making them more difficult to maintain. A few words were said about that by Marco Pivetta in his blog post Fluent Interfaces are Evil; now I will add my few cents.

Donnie Brasco (1997) by Mike Newell
Donnie Brasco (1997) by Mike Newell

Let's take my own library jcabi-http, which I created a few years ago, when I thought that fluent interfaces were a good thing. Here is how you use the library to make an HTTP request and validate its output:

String html = new JdkRequest("https://www.google.com")
  .method("GET")
  .fetch()
  .as(RestResponse.class)
  .assertStatus(200)
  .body();

This convenient method chaining makes the code short and obvious, right? Yes, it does, on the surface. But the internal design of the library's classes, including JdkRequest, which is the one you see, is very far from elegant. The biggest problem is that they are rather big, and it's difficult, if not impossible, to extend them without making them even bigger.

For example, right now JdkRequest has the methods method(), fetch(), and a few others. What happens when new functionality is required? The only way to add it is to make the class bigger by adding new methods, which is how we jeopardize its maintainability. Here, for example, we added multipartBody() and here we added timeout().

I always feel scared when I get a new feature request in jcabi-http. I understand that it most probably means adding new methods to Request, Response, and other already bloated interfaces and classes.

I actually tried to do something in the library in order to solve this problem but it wasn't easy. Look at this .as(RestResponse.class) method call. What it does is decorate a Response with RestResponse, in order to make it method-richer. I just didn't want to make Response contain 50+ methods, like many other libraries do. Here is what it does (this is pseudo-code):

class Response {
  RestResponse as() {
    return new RestResponse(this);
  }
  // Seven methods
}
class RestResponse implements Response {
  private final Response origin;
  // Original seven methods from Response
  // Additional 14 methods
}

As you see, instead of adding all possible methods to Response, I placed them in supplementary decorators: RestResponse, JsonResponse, XmlResponse, and others. It helps, but in order to connect these decorators to the central object of type Response we have to use that "ugly" method as(), which depends heavily on reflection and type casting.
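The as() mechanism can be sketched roughly like this (a minimal sketch with a one-method Response; the real jcabi-http interfaces and class names differ, and the one-argument-constructor convention is my assumption):

```java
import java.lang.reflect.Constructor;

// A rough sketch of a reflection-based as(): the decorator is assumed
// to expose a one-argument constructor taking the decorated Response.
interface Response {
  String body();
}

class DefaultResponse implements Response {
  @Override
  public String body() {
    return "hello";
  }

  // Wrap this response into a richer decorator, located via reflection.
  public <T extends Response> T as(final Class<T> type) {
    try {
      final Constructor<T> ctor = type.getDeclaredConstructor(Response.class);
      return ctor.newInstance(this);
    } catch (final ReflectiveOperationException ex) {
      throw new IllegalStateException(ex);
    }
  }
}

class RestResponse implements Response {
  private final Response origin;
  public RestResponse(final Response origin) {
    this.origin = origin;
  }
  @Override
  public String body() {
    return this.origin.body();
  }
}
```

The reflection lookup hidden inside as() is exactly the kind of machinery I call ugly: it works, but the compiler can no longer guarantee that the decorator has the expected constructor.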

In other words, fluent interfaces mean large classes or some ugly workarounds. I mentioned this problem earlier, when I wrote about Streams API and the interface Stream, which is perfectly fluent. There are 43 methods!

That is the biggest problem with fluent interfaces---they force objects to be huge.

Fluent interfaces are perfect for their users, since all methods are in one place and the amount of classes is very small. It is easy to use them, especially with code auto-completion in most IDEs. They also make client code more readable, since "fluent" constructs look similar to plain English (aka DSL).

That is all true! However, the damage they cause to object design is the price, which is too high.

What is the alternative?

I would recommend you use decorators and smart objects instead. Here is how I would design jcabi-http, if I could do it now:

String html = new BodyOfResponse(
  new ResponseAssertStatus(
    new RequestWithMethod(
      new JdkRequest("https://www.google.com"),
      "GET"
    ),
    200
  )
).toString();

This is the same code as in the first snippet above, but it is much more object-oriented. The obvious problem with this code, of course, is that the IDE will hardly be able to auto-complete anything. Also, we will have to remember many class names. And the construct looks rather difficult to read for those who are used to fluent interfaces. In addition, it's very far from the DSL idea.

But here is the list of benefits. First, each object is small and cohesive, and they are all loosely coupled---obvious merits in OOP. Second, adding new functionality to the library is as easy as creating a new class; no need to touch existing classes. Third, unit testing is simplified, since classes are small. Fourth, all classes can be immutable, which is also an obvious merit in OOP.
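The second benefit can be illustrated: to add, say, a header assertion, we write one new decorator and touch nothing else (a sketch; Response is reduced to a single method here, and ResponseAssertHeader is a hypothetical name, not a class from jcabi-http):

```java
// A sketch of "new functionality = new class": a decorator that verifies
// one header on the fly, without changing any existing class.
interface Response {
  String header(String name);
}

final class ResponseAssertHeader implements Response {
  private final Response origin;
  private final String name;
  private final String expected;
  ResponseAssertHeader(final Response origin, final String name,
    final String expected) {
    this.origin = origin;
    this.name = name;
    this.expected = expected;
  }
  @Override
  public String header(final String key) {
    final String value = this.origin.header(key);
    if (key.equals(this.name) && !this.expected.equals(value)) {
      throw new AssertionError(
        String.format("Header %s is %s, not %s", key, value, this.expected)
      );
    }
    return value;
  }
}
```

Existing decorators stay untouched, and this new class can be unit-tested in isolation with a fake Response.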

Thus, there seems to be a conflict between usefulness and maintainability. Fluent interfaces are good for users, but bad for library developers. Small objects are good for developers, but difficult to understand and use.

It seems to be so, but only if you are used to large classes and procedural programming. To me, a large number of small classes seems to be an advantage, not a drawback. Libraries that are clear, simple, and readable inside are much easier to use, even when I don't know exactly which classes are the most suitable for me. Even without code auto-completion I can figure it out myself, because the code is clean.

Also, I very often find myself interested in extending existing functionality, either inside my code base or via a pull request to the library. I am much more willing to do that if I know that the changes I introduce are isolated and easy to test.

Thus, no fluent interfaces anymore from me, only objects and decorators.

© Yegor Bugayenko 2014–2018

Don't Aim for Quality, Aim for Speed

QR code

Don't Aim for Quality, Aim for Speed

  • Moscow, Russia
  • comments

I decided to write this blog post after reviewing this pull request. What happened there? The author of the PR wasn't able to figure out the "right" way to implement it, and the code reviewer was waiting and waiting. Eventually, the reviewer came to me, since I was an architect, and complained that it was taking too long and he wasn't able to earn his money for the review he had done. Then the author of the changes explained that he couldn't finish since there were impediments and design inconsistencies; he also couldn't earn the money he deserved for fixing the issue. What did I say? I said: Forget the quality, just finish it any way possible.

Shi mian mai fu (2004) by Yimou Zhang
Shi mian mai fu (2004) by Yimou Zhang

Was I kidding? Not at all.

I truly believe that quality is not what programmers should care about. They must care only about speed---close tasks as soon as possible---which means making money.

Won't this attitude ruin the project and turn the code base into a mess?

Yes, it will.

If the project doesn't care about its quality either.

There must be a permanent conflict between a project and its programmers: 1) the project must be configured to reject anything that lowers the quality of its artifacts and 2) programmers must be interested in making changes to those artifacts. The project cares about the quality, the programmers care about fast delivery of modifications.

What do I mean by saying that a project rejects low quality? Here is a list of preventive measures it may take to make it impossible to jeopardize the quality:

What do I mean by saying that programmers must be interested in making changes? They have to be motivated to close tasks. Not just to be in the project, but to deliver. Here is what they can do in order to close tasks faster:

If we put these two interests in conflict, we will get a high-quality product, which is growing very fast. The project will enforce quality, programmers will push the code forward, making changes fast and frequently.

Unfortunately, most projects have a very different philosophy. They delegate quality control to programmers, hoping that they "won't do evil." This leads to frustration, distress, constant fear of mistakes, long delays, blaming, and shaming. Both the project and its programmers lose.

Programmers must not be responsible for the quality! They must not care what they may, or will, break. They must not care how good the code they write is. They must not "feel responsible" for the overall result. Instead, they must be focused on making money for their families by writing the largest amount of code and closing more tickets.

Not because they are ignorant and selfish, but because this is the right balance of responsibilities. This is how the project will get the most out of its developers---by freeing their minds of unnecessary and unproductive quality worries and letting them focus on what they do best---writing code.

Of course, not every project will be able to configure itself in the most effective way. Most projects don't even know how to do it. In those projects, if you, as a developer, floor the speed pedal, you will most likely ruin their code base in a few days. That's why the recommendations above are only applicable to those who really know what they are doing.

We know what we are doing in our projects. We don't let any developers touch any parts of our code, unless the "quality wall" is high and strong enough. How high is that wall in your projects? Can you say that, no matter how bad some code is and how sneakily its author introduces it, it will be rejected?

© Yegor Bugayenko 2014–2018

Don't Parse, Use Parsing Objects

QR code

Don't Parse, Use Parsing Objects

  • Moscow, Russia
  • comments

The traditional way of integrating an object-oriented back-end with an external system is through data transfer objects, which are serialized into JSON before going out and deserialized when coming back. This approach is as popular as it is wrong. The serialization part should be replaced by printers, which I explained earlier. Here is my take on deserialization, which should be done by---guess what---objects.

La science des rêves (2006) by Michel Gondry
La science des rêves (2006) by Michel Gondry

Say there is a back-end entry point, which is supposed to register a new book in the library, arriving in JSON:

{
  "title": "Object Thinking",
  "isbn": "0735619654",
  "author": "David West"
}

Also, there is an object of class Library, which expects an object of type Book to be given to its method register():

class Library {
  public void register(Book book) {
    // Create a new record in the database
  }
}

Say also, type Book has a simple method isbn():

interface Book {
  String isbn();
}

Now, here is the HTTP entry point (I'm using Takes and Cactoos), which is accepting a POST multipart/form-data request and registering the book in the library:

public class TkUpload implements Take {
  private final Library library;
  @Override
  public Response act(Request req) throws IOException {
    String body = new RqPrint(
      new RqMtSmart(new RqMtBase(req)).single("book")
    ).printBody();
    JsonObject json = Json.createReader(
      new InputStreamOf(body)
    ).readObject();
    Book book = new BookDTO();
    book.setIsbn(json.getString("isbn"));
    this.library.register(book);
    return new RsWithStatus(200);
  }
}

What is wrong with this? Well, a few things.

First, it's not reusable. If we were to need something similar in a different place, we would have to write this HTTP processing and JSON parsing again.

Second, error handling and validation are not reusable either. If we add it to the method above, we will have to copy it everywhere. Of course, the DTO may encapsulate it, but that's not what DTOs are usually for.

Third, the code above is rather procedural and has a lot of temporal coupling.

A better design would be to hide this parsing inside a new class JsonBook:

class JsonBook implements Book {
  private final String json;
  JsonBook(String body) {
    this.json = body;
  }
  @Override
  public String isbn() {
    return Json.createReader(
      new InputStreamOf(this.json)
    ).readObject().getString("isbn");
  }
}

Then, the RESTful entry point will look like this:

public class TkUpload implements Take {
  private final Library library;
  @Override
  public Response act(Request req) throws IOException {
    this.library.register(
      new JsonBook(
        new RqPrint(
          new RqMtSmart(
            new RqMtBase(req)
          ).single("book")
        ).printBody()
      )
    );
    return new RsWithStatus(200);
  }
}

Isn't that more elegant?
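One consequence of this design is testability: a parsing object needs only a string, no HTTP machinery at all. A self-contained sketch (to keep it dependency-free, the ISBN is extracted with a crude regex; the real JsonBook above uses a JSON reader):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A sketch of a parsing object under test: it wraps a plain string,
// so no server, request, or DTO is involved.
interface Book {
  String isbn();
}

final class JsonBook implements Book {
  private final String json;
  JsonBook(final String body) {
    this.json = body;
  }
  @Override
  public String isbn() {
    final Matcher mtr = Pattern
      .compile("\"isbn\"\\s*:\\s*\"([^\"]+)\"")
      .matcher(this.json);
    if (!mtr.find()) {
      throw new IllegalArgumentException("No isbn found");
    }
    return mtr.group(1);
  }
}
```

Any unit test can now do `new JsonBook("{\"isbn\": \"0735619654\"}").isbn()` without starting a web server.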

Here are some examples from my projects: RqUser from zerocracy/farm and RqUser from yegor256/jare.

As you can see from the examples above, sometimes we can't use implements because some types in Java are not interfaces but final classes: String is a "perfect" example. That's why I have to do this:

class RqUser implements Scalar<String> {
  @Override
  public String value() {
    // Parsing happens here and returns String
  }
}

But aside from that, these examples perfectly demonstrate the principle of "parsing objects" suggested above.

© Yegor Bugayenko 2014–2018

Microvesting

QR code

Microvesting

  • Moscow, Russia
  • comments

Most startups don't have enough cash to pay programmers as much as they deserve, unfortunately (or maybe not). Instead of cash, startups give their early employees shares of stock, which they will be able to either 1) sell in a few years and become millionaires, or even billionaires, or 2) throw away and remain nobodies. It's a common practice. The question, however, is what is the right procedure, and the optimal algorithm, for transferring those shares to programmers. When exactly do they become shareholders? What is the formula?

La comunidad (2000) by Álex de la Iglesia
La comunidad (2000) by Álex de la Iglesia

There are a few typical approaches.

One of the most popular is "four years with a one-year cliff," which means that if they had 50% equity and leave after two years, they will only retain 25%. The longer they stay, the larger the percentage of their equity that is vested, until they become fully vested in the 48th month. However, because of the one-year cliff, if they leave before the 12th month, they get nothing. There can be slight modifications to the numbers, of course.
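A minimal sketch of that arithmetic, assuming linear monthly vesting after the cliff (real agreements vary in granularity and schedule):

```java
// "Four years with a one-year cliff": nothing vests before month 12,
// then vesting is linear up to full equity at month 48.
final class Vesting {
  private Vesting() {
  }
  static double vested(final double equity, final int months) {
    if (months < 12) {
      return 0.0; // before the cliff, nothing is vested
    }
    return equity * Math.min(months, 48) / 48.0;
  }
}
```

With 50% equity, `Vesting.vested(0.50, 24)` gives 0.25, matching the example above.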

The disadvantage of this approach is that the programmers' primary motivation is to stay in the company, instead of achieving results. This vesting formula is perfectly aligned with the popular be nice paradigm and is not beneficial, either to the company or to its employees (read: slaves).

Another option is milestone-based vesting, which defines a set of value milestones, each of which unlocks an additional part of the programmer's equity.

On top of the inability to predict milestones accurately, this vesting formula promotes group responsibility, which, in my opinion, is the least effective way to motivate. Programmers writing Java classes can't be responsible for the "next round of VC funding," simply because they don't have any idea how to make that round happen. It's not their job, not their responsibility.

You may say that writing those Java classes is exactly how we make the next round happen, but that's far from true in most cases. We all know that investments come to those who can pitch to (read: fool) an investor, not to those who write the best Java code. Thus, the work programmers do and the "value events" the startup is aiming for are pretty much disconnected.

A more logical formula is microvesting, which we practice in projects managed by Zerocracy. It is as simple as that: A company has a valuation, which is set by its founders; let's say, it's $1,000,000. A programmer has an hourly rate, say, $40. Thus, when a one-hour fixed-budget task is completed, the programmer earns 0.004% of equity ($40 / $1,000,000). Our software calculates it all automatically, increasing their shares after each completed task.
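The same formula in code, using the hypothetical numbers from the text:

```java
// Microvesting: a one-hour task at a $40 hourly rate against a
// $1,000,000 valuation earns $40 / $1,000,000 = 0.004% of equity.
final class Microvesting {
  private Microvesting() {
  }
  static double percentPerTask(final double hourlyRate,
    final double valuation) {
    return hourlyRate / valuation * 100.0;
  }
}
```

So `Microvesting.percentPerTask(40.0, 1_000_000.0)` yields 0.004 (percent), and each completed task adds that much to the programmer's share.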

Using these two variables---valuation and hourly rate---the company can influence programmers' motivation.

No need to lie to them about big-money milestones or keep them in the office for four years. Just let them be focused on the results they can produce and give them back what they deserve. Incrementally. That's it.

© Yegor Bugayenko 2014–2018

More Bugs, Please

QR code

More Bugs, Please

  • Moscow, Russia
  • comments

A bug is something we find in a software product that "doesn't look right" (this is my personal definition). A bug can be hidden or visible; it can be "already fixed" or "still present"; it can be critical or cosmetic; it can be urgent or of a low priority. What is important is that the more bugs we are able to find and fix before our customers see them, the higher the perceived quality of the software. Simply put, bugs are a very good thing, if they are found by us, not our customers. We pay our programmers for each bug they find. Here is a cheat sheet for them, showing where and how they can find those bugs, to make more money.

American Honey (2016) by Andrea Arnold
American Honey (2016) by Andrea Arnold

Obviously, if something is broken, it's a bug; no need to mention it here. However, when a product is more or less stable, not too many things are visibly broken. But we still pay for bugs. What should you look out for? Read on. This list (in no particular order) will help you.

Lack of functionality. If a class (yegor256/cactoos#558) or the entire module (yegor256/cactoos#399) doesn't provide the functionality you expect it to have, it's a bug.

Lack of tests. If a class doesn't have a unit test (yegor256/takes#43) or the existing test doesn't cover some critical aspects of the class (yegor256/cactoos#375), it's a bug.

Lack of documentation. If, say, a Javadoc block for a class does not clearly explain to you how to use the class, or the entire module is not documented well (yegor256/takes#790), it's a bug.

Suboptimal implementation. If a piece of code doesn't look good to you, and you think it can be refactored to look better, it's a bug.

Design inconsistency. If the design doesn't look logical to you (yegor256/cactoos#436) and you know how it can be improved, it's a bug.

Naming is weird. If class, variable or package names don't look consistent and obvious to you, and you know how they can be fixed (yegor256/cactoos#274), it's a bug.

Unstable test. If a unit test fails sporadically (yegor256/takes#506) or doesn't work in some particular environment (yegor256/jpeek#151), it's a bug.

Also, it's worth mentioning that minor, cosmetic, or poorly formulated bug reports will most likely be rejected or not paid for. If you want us to pay for your bug reports, make sure they sound right, in order to help us move the project forward to a better state.

© Yegor Bugayenko 2014–2018

Are You a Coder or a Developer?

QR code

Are You a Coder or a Developer?

  • Moscow, Russia
  • comments

Software development and coding are two different things. Usually, the former includes the latter, but not always. Coding produces lines of code, while software development creates products. Unfortunately, the majority of programmers joining Zerocracy now are coders. Even though they claim to be developers, in reality they lack the very important sociotechnical skills that differentiate product creators from lines-of-code writers.

Hard Men (1996) by J.K. Amalou
Hard Men (1996) by J.K. Amalou

Let me show you the symptoms first.

Let's call him Mario. He is a very skilled Java developer, as his resume says. He's been in the industry for ten years or so, done a few enterprise projects; he seems to be very seasoned. We give him access to the project and assign a few GitHub tickets.

In a few hours I get a Facebook message a page long. It says that he is very glad to be on the project, but doesn't understand a thing yet and needs help. Here is a list of questions he prepared and he's ready for a phone call to get them answered.

I reply: "Dude, I love you like a brother, but I don't have time to answer your questions. Not because I'm lazy or don't appreciate your work... Actually, yes, exactly because of that. I am lazy and don't want to answer any questions over Messenger. My answers will be totally wasted if you, for example, quit the project tomorrow. Or if someone else joins us in a week and has exactly the same set of questions. Do I have to explain it all over again? I'm too lazy for that."

He most probably thinks that I'm an arrogant prick, but what can he do? He reads my article on this very subject and says "OK, I got it."

In half an hour Mario submits a ticket to another (!) repository. The title is "The problem" and the description says "Help me understand the project."

What do I do, as an architect of the project? I close the ticket with a quick message: "Please, make your tickets more specific." My response is just one step away from "Get lost," but what else can I say? Mario doesn't know how to use the ticketing system. He's most probably been working all his life in a cozy office, where everybody around was his friend. Not even using chat, just asking questions across the table. I'm asking him to do something he has never done before. Of course, he doesn't know how. He feels ashamed, I suspect.

What happens next? He comes right back at me in Messenger, with the same set of questions. Actually, his reaction will depend on his personality. It may either be anger, confusion, or something else. But the bottom line is that Mario is not a software developer, he's a coder. He doesn't understand the dynamics of a modern software project, he doesn't know how to use its communication instruments, and he has no sociotechnical skills:

  • Searching for, and finding, information
  • Submitting questions, collecting answers
  • Adding knowledge to the repository
  • Submitting code changes
  • Arguing in writing, reviewing changes
  • Closing tickets and preventing them from closing
  • Maintaining discipline in repositories

The same happens to almost everybody who joins us, unfortunately.

A modern software project is much more a social activity than code writing. Knowing how to interact with the team and deal with information is much more important than knowing how to use design patterns. The only way to learn these skills is practice. I've said it many times, let me repeat it again: If you are not an open source and StackOverflow activist, you most likely won't have these skills.

© Yegor Bugayenko 2014–2018

The Educational Aspect of Static Analysis

QR code

The Educational Aspect of Static Analysis

  • Moscow, Russia
  • comments

Very often new programmers who join our projects ask us whether we have auto-formatting instruments to make Java code look exactly the way Qulice expects. (Qulice is the static analyzer we use.) I always reply that having such an automated code polisher would only be harmful and wouldn't help the project and its members improve and grow. Here is why I think so.

Blind Fury (1989) by Phillip Noyce
Blind Fury (1989) by Phillip Noyce

Static analysis, the way we do it in combination with read-only master branch, is a fully automated uncompromising review of your pull request, mostly intended to spot code formatting mistakes. Say we want Java code in our entire repository to look like this:

final class Doc {
  private final File file;
  public void remove() {
    if (this.file.exists()) {
      this.file.delete();
    }
  }
}

However, you refactor it as part of a bigger task, and submit a pull request like this:

class Doc {
  private File f;
  public void remove()
  {
    if (f.exists())
      f.delete();
  }
}

For some of you this may not seem like a big difference, since both code snippets compile without issues and work exactly the same way. However, for us, the repository maintainers, it is a big deal. We do want our classes to always be final, we do want them to be immutable (so all attributes should also be final), we want to prefix all attribute references with this., and we want the code to be formatted the same way, since we believe that the uniformity of the code seriously increases its maintainability.

Of course, we could create a tool which you could then use to re-format the code, to make it look the way we want. But in that case you would never learn what the project wants from you and why.

You will not know the reasoning behind our rules. You will never think about them. You will not really care about them. But they are not only about the formatting of spaces and brackets. There are over 900 of them in Qulice and some of them were designed especially for the object-oriented philosophy we are preaching.

Thus, simply put, we don't want you to go through the static analysis phase easily. We want you to suffer in order to learn.

© Yegor Bugayenko 2014–2018

Five Stages of Microbudgeting

QR code

Five Stages of Microbudgeting

  • Moscow, Russia
  • comments

Microtasking, which I explained in an earlier post, works only when each task has a very specific reward for success and a punishment for failure. I believe that the best reward-and-punishment instrument is money. The budget is fixed; the programmer gets it only when the task is completed (reward), no matter how much time it took; if it is not completed, there is no money at all (punishment). Pure and simple. However, a logical question arises: how can we know upfront what the right budget is? Who sets it?

Taxi Driver (1976) by Martin Scorsese
Taxi Driver (1976) by Martin Scorsese

When we started to play with microtasking in our projects, in 2009, we asked programmers to estimate each task. It did work, but only with very simple and obvious tasks. More complex ones almost always suffered from either under-estimating or padding---the numbers were either very small, and task performers complained in the end, or too big, and customers asked for refunds. It was not a manageable situation.

Then we realized that it would be better if all tasks were rather small, with exactly the same budget. We tried to use two hours as a universal fixed estimate. Everything that didn't fit, programmers were allowed to reject. This model didn't really work either, because our managers had to deal with a very large number of rejected tasks and didn't know how to make them smaller, since they were not programmers.

Finally, in March 2010 we found a solution, which was labeled Puzzle Driven Development (PDD). According to this concept: 1) Any task has a very small fixed budget (we use 30 minutes); 2) The task performer is allowed to complete only part of the task; 3) The code that is being returned to master must include @todo markers, called "puzzles"; 4) Puzzles are automatically converted to new tasks.
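In code, a puzzle is just a specially formatted @todo comment, which 0pdd turns into a new ticket. A sketch (the ticket number, estimate, and wording are invented for illustration):

```java
// A sketch of a PDD puzzle: the @todo marker follows the
// "@todo #<ticket>:<estimate> <description>" format used by 0pdd;
// the numbers and text here are made up.
final class Discounts {
  /**
   * Calculate the discount for a client.
   * @todo #123:30min The discount is hard-coded to zero for now.
   *  Implement the real calculation based on the client's purchase
   *  history and add a unit test for it.
   */
  double discount() {
    return 0.0;
  }
}
```

The task performer ships the stub, stays within the 30-minute budget, and the remaining work becomes a fresh task for someone else.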

The beauty of this approach is that the most complicated part of software project management---scope decomposition---is moved onto the shoulders of those who are best at it: programmers.

We are using PDD in all our projects now and have even created a public instrument for GitHub repositories, which allows anyone to play with PDD at no cost: 0pdd.com. This is exactly the same tool we are using in our commercial projects.

However, if and when you decide to apply microbudgeting to your project, together with PDD, there will be problems. Psychological ones mostly. In my experience, people go through five stages when they face microbudgeting for the first time:

  • Denial. They ask "How is it possible?" and then refuse to hear any explanations. There are many reasons why microbudgeting and microtasking may not work---you will hear them all. Very often they demand a traditional model of payment, especially if they were invited. They just say that our model is insane, and if we want to see them work on our projects we have to pay for as much time as they spend. Most of them quit.

  • Anger. Some of them decide to try. Thanks to their previous multi-year experience, they expect to be paid by the end of the day/week/month, no matter what they were doing. Very soon they realize that the total income for the first day of work was $0.00, even though they were doing something. They get very angry. They call us crooks, fraudsters, and many other names. Asking them to read the policy again doesn't help. They simply can't believe that we are not going to pay them anything, even though they were doing something. Most of them quit.

  • Bargaining. Almost everybody at this stage recommends we change the model. They explain why it's not really effective and how great it would be if we paid them the traditional way. They give us examples of their previous projects, send references from previous employers, and criticize my blog posts. With some of them I try to argue, when their criticism is constructive. Most of them quit.

  • Depression. Most programmers are used to doing work because they feel guilty if the task is not done or the bug is not fixed. Microbudgeting requires a completely opposite attitude: we all are supposed to work because we are greedy. Money has to motivate us, not guilt. If there is no money, we don't work. Most people, when they see this new motivational paradigm and don't see the usual guilt, lose coordination and don't know what to do. They can't really achieve anything, because there is no traditional manager standing behind them and pushing them forward. They are supposed to go for the money. They don't, and so they don't make any money. Most of them quit.

  • Acceptance. Finally, the best of them realize that the model can work if they follow the rules, which are very simple: be greedy, selfish, egoistic, money-driven, result-oriented, lazy, misanthropic, heartless, and arrogant. They accept the fact that they lose, compete, work, and make money only when they produce results. They start enjoying meritocracy at its best.

You understand already that the vast majority of those who try to work with us can't really get to the final point---they quit somewhere in the middle. Most probably something very similar will happen on your projects too.

What is the solution? I don't really know.

Statistically speaking, three to five people out of a hundred manage to survive and become effective and productive. Thus, to build a team of twenty people you will have to screen and try at least 400.

© Yegor Bugayenko 2014–2018

Operator new() is Toxic

QR code

Operator new() is Toxic

  • Moscow, Russia
  • comments

To instantiate objects in most object-oriented languages, including Java, Ruby, and C++, we use the operator new(). Well, unless we use static factory methods, which we don't use because they are evil. Even though it looks so easy to make a new object any time we need it, I would recommend being more careful with this rather toxic operator.

The Gift (2015) by Joel Edgerton
The Gift (2015) by Joel Edgerton

I'm sure you understand that the problem with this operator is that it couples objects, making testing and reuse very difficult or even impossible. Let's say there is a story in a file that we need to read as a UTF-8 text (I'm using TextOf from Cactoos):

class Story {
  String text() {
    return new TextOf(
      new File("/tmp/story.txt")
    ).asString();
  }
}

It seems super simple, but the problem is obvious: class Story can't be reused. It can only read one particular file. Moreover, testing it will be rather difficult, since it reads the content from exactly one place, which can't be changed at all. More formally this problem is known as an unbreakable dependency---we can't break the link between Story and /tmp/story.txt---they are together forever.

To solve this we need to introduce a constructor and let Story accept the location of the content as an argument:

class Story {
  private final File file;
  Story(File f) {
    this.file = f;
  }
  String text() {
    return new TextOf(this.file).asString();
  }
}

Now, each user of the Story has to know the name of the file:

new Story(new File("/tmp/story.txt"));

It's not really convenient, especially for those users who were using Story before, knowing nothing about the file path. To help them we introduce a secondary constructor:

class Story {
  private final File file;
  Story() { // Here!
    this(new File("/tmp/story.txt"));
  }
  Story(File f) {
    this.file = f;
  }
  String text() {
    return new TextOf(this.file).asString();
  }
}

Now we just make an instance through a no-arguments constructor, just like we did before:

new Story();

I'm sure you're well aware of this technique, which is also known as dependency injection. I'm actually not saying anything new. What I want you to pay attention to here is the location and the amount of new operators in all three code snippets.

In the first snippet both new operators are in the method text(). In the second snippet we lost one of them. In the third snippet one operator is in the method, while the second one moved up, to the constructor.

Remember this fact and let's move on.

What if the file is not in UTF-8 encoding but in KOI8-R? Class TextOf and then method Story.text() will throw an exception. However, class TextOf is capable of reading in any encoding; it just needs a second argument in its constructor:

new TextOf(this.file, "KOI8_R").asString();

In order to make Story capable of using different encodings, we need to introduce a few additional secondary constructors and modify its primary constructor:

class Story {
  private final Text text;
  Story() {
    this(new File("/tmp/story.txt"));
  }
  Story(File f) {
    this(f, StandardCharsets.UTF_8);
  }
  Story(File f, Charset e) {
    this(new TextOf(f, e));
  }
  Story(Text t) {
    this.text = t;
  }
  String text() {
    return this.text.asString();
  }
}

It's just dependency injection, but pay attention to the locations of the operator new. They are all in the constructors now and none of them are left in the method text().

The tendency here is obvious to me: the more new operators stay in the methods, the less reusable and testable the class is.

In other words, operator new is a rather toxic thing, so try to keep its usage to a minimum in your methods. Make sure you instantiate everything or almost everything in your secondary constructors.
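
The payoff of pushing new up into the constructors is testability. Here is a minimal sketch of a unit test for the final design; my own Text interface stands in for the Cactoos one, and a lambda plays the role of a fake text:

```java
// A stand-in for the Cactoos Text interface (an assumption).
interface Text {
  String asString();
}

// The final Story design: everything is injected, nothing is
// instantiated inside the method text().
class Story {
  private final Text text;
  Story(Text t) {
    this.text = t;
  }
  String text() {
    return this.text.asString();
  }
}

public class Main {
  public static void main(String[] args) {
    // A unit test injects a fake and never touches the file system.
    Story story = new Story(() -> "once upon a time");
    System.out.println(story.text()); // prints "once upon a time"
  }
}
```

With the first snippet such a test was impossible: the file path was sealed inside the method.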

© Yegor Bugayenko 2014–2018

The Formula for Software Quality


  • Voronezh, Russia

How do you define the quality of a software product? There is definitely an intrinsic emotional component to it, which means satisfaction for the user, willingness to pay, appreciation, positive attitude, and all that. However, if we put emotions aside, how can we really measure it? The IEEE says that quality is the degree to which a product meets its requirements or user expectations. But what is the formula? Can we say that it satisfies requirements and expectations to, say, 73%?

Coco Chanel & Igor Stravinsky (2009) by Jan Kounen
Coco Chanel & Igor Stravinsky (2009) by Jan Kounen

Here is the formula and the logic I'm suggesting.

As we know, any software product has an unlimited number of bugs. Some of them are discovered and fixed by the development team; let's call them F. Some of them are discovered by the end users; let's call them U. Thus, the total number of bugs we are aware of, out of an infinity of them, is F+U.

Obviously, the smaller U is, the higher the quality. Ideally, U has to be zero, which will mean that users don't see any bugs at all. How can we achieve that, if the total amount of bugs is infinite? The only possible way to do it is to increase F, hoping that U will decrease automatically.

Thus, the quality of a product can be measured as:

Quality = F / (F + U)

We simply divide the number of bugs found by the team by the total number of bugs visible. Thus, the more bugs we manage to find before our users see them, the higher the quality.

A quality of 100% means that no bugs are found by the users. A quality of 0% means that all bugs are found by them.
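
In code the metric is a one-liner; here is a throwaway sketch (the class and method names are mine):

```java
public class Quality {
  // Quality = F / (F + U): the share of known bugs that the
  // team caught before the users did.
  static double of(int foundByTeam, int foundByUsers) {
    return (double) foundByTeam / (foundByTeam + foundByUsers);
  }
  public static void main(String[] args) {
    // 90 bugs caught internally, 10 escaped to the users.
    System.out.println(Quality.of(90, 10)); // prints 0.9
  }
}
```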

Does it make sense?


P.S. It seems that I'm not the inventor of the formula. This is the quote from Managing the Testing Process: Practical Tools and Techniques for Managing Hardware and Software Testing (2009) by Rex Black, page 109: A common metric of test team effectiveness measures whether the test team manages to find a sizeable majority of the bugs prior to release. The production or customer bugs are sometimes called test escapes. The implication is that your test team missed these problems but could reasonably have detected them during test execution. You can quantify this metric as follows:

Defect Detection Percentage = (bugs found by the test team) / (bugs found by the test team + test escapes)

P.P.S. Here is another similar metric by Capers Jones at Software Defect Removal Efficiency, Computer, Volume 29, Issue 4, 1996: "Serious software quality control involves measurement of defect removal efficiency (DRE). Defect removal efficiency is the percentage of defects found and repaired prior to release. In principle the measurement of DRE is simple. Keep records of all defects found during development. After a fixed period of 90 days, add customer-reported defects to internal defects and calculate the efficiency of internal removal. If the development team found 90 defects and customers reported 10 defects, then DRE is of course 90%."

© Yegor Bugayenko 2014–2018

SRP is a Hoax


  • Moscow, Russia

The Single Responsibility Principle, according to Robert Martin's Clean Code, means that "a class should have only one reason to change." Let's try to decrypt this rather vague statement and see how it helps us design better object-oriented software. If it does.

The Thomas Crown Affair (1999) by John McTiernan

I mentioned SRP once in my post about SOLID, saying that it doesn't really help programmers understand the good old "high cohesion" concept, which was introduced by Larry Constantine back in 1974. Now let's see it by example and analyze how we can improve a class, with the SRP in mind, and whether it will become more object-oriented.

Let's try the class AwsOcket from jcabi-s3 (I've simplified the code):

class AwsOcket {
  boolean exists() { /* ... */ }
  void read(final OutputStream output) { /* ... */ }
  void write(final InputStream input) { /* ... */ }
}

Correct me if I'm wrong, but according to SRP this class is responsible for too many things: 1) checking the existence of an object in AWS S3, 2) reading its content, and 3) modifying its content. Right? It's not a good design and it must be changed.

In order to change it and make it responsible for just one thing, we must introduce a getter that returns the AWS client, and then create three new classes: ExistenceChecker, ContentReader, and ContentWriter. They will check, read, and write, respectively. Now, in order to read the content and print it to the console I'm currently doing this:

if (ocket.exists()) {
  ocket.read(System.out);
}

Tomorrow, if I refactor the class, I will be doing this:

if (new ExistenceChecker(ocket.aws()).exists()) {
  new ContentReader(ocket.aws()).read(System.out);
}

Aside from the fact that these checkers, readers, and writers are not really classes, but pure holders of procedures, the usage of this ocket turns into a nightmare. We can't really know anymore what will happen with it when we pass it somewhere. We can't, for example, guarantee that the content that is coming from it is decrypted or decoded on the fly. We simply can't decorate it. It is not an object anymore, but a holder of an AWS client, which is used by some other classes somewhere.
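
To make the point about decoration concrete, here is what the original design allows (a sketch: the Ocket interface is a simplified stand-in for the jcabi-s3 one, and upper-casing stands in for on-the-fly decryption):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Simplified stand-in for the jcabi-s3 ocket.
interface Ocket {
  boolean exists();
  void read(OutputStream output) throws IOException;
}

// A decorator: callers still see an Ocket, but the content is
// transformed on the fly (imagine decryption instead of upper-casing).
class UpperOcket implements Ocket {
  private final Ocket origin;
  UpperOcket(Ocket o) {
    this.origin = o;
  }
  @Override
  public boolean exists() {
    return this.origin.exists();
  }
  @Override
  public void read(OutputStream output) throws IOException {
    // Buffer the original content, then write it transformed.
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    this.origin.read(buf);
    output.write(
      buf.toString(StandardCharsets.UTF_8.name())
        .toUpperCase().getBytes(StandardCharsets.UTF_8)
    );
  }
}
```

Once the ocket is split into checkers, readers, and writers around a shared AWS client, no such decorator is possible; there is no object left to wrap.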

Yes, now it is responsible for only one thing: encapsulating the reference to the AWS client. It is a perfect class as far as SRP is concerned. But it is not an object anymore.

The same will happen with any class if you apply the SRP principle to its full extent: it will become a holder of data or of other objects, with a collection of setters and getters on top of them. Maybe with one extra method in addition to those.

My point is that SRP is a wrong idea.

Making classes small and cohesive is a good idea, but making them responsible "for one thing" is a misleading simplification of a "high cohesion" concept. It only turns them into dumb carriers of something else, instead of being encapsulators and decorators of smaller entities, to construct bigger ones.

In our fight for this fake SRP idea we lose a much more important principle, which really is about true object-oriented programming and thinking: encapsulation. It is much less important how many things an object is responsible for than how tightly it protects the entities it encapsulates. A monster object with a hundred methods is much less of a problem than a DTO with five pairs of getters and setters! This is because a DTO spreads the problem all over the code, where we can't even find it, while the monster object is always right in front of us and we can always refactor it into smaller pieces.

Encapsulation comes first, size goes next, if ever.

© Yegor Bugayenko 2014–2018

Alan Kay Was Wrong About Him Being Wrong


  • Moscow, Russia

From time to time someone asks me what I think about what Alan Kay, the father of OOP, the designer of Smalltalk, the first object-oriented language, said in 1998 about OOP. He literally said that the very term "object" was misleading and a more appropriate one would be "messaging." Here is what I think.

Rain Man (1988) by Barry Levinson

I believe that there are two orthogonal means of interaction between objects: messaging and composition. Let's say, we have a point and a canvas:

Point p = new Point(x, y);
Canvas canvas = new Canvas();

This is how messaging would look:

p.printTo(canvas);

The problem with messaging is that it keeps objects on the same level of abstraction. They communicate as equal and independent "modules," sending data messages to each other. Even though they look object-oriented, the entire communication pattern is very procedural. We try to encapsulate as much as we can inside a single object, yet we inevitably still have to expose a lot of its data in order to be able to "connect" it with other objects.

We turn objects into "little computers," as some books refer to them. They expect data to come in, they process the data, and they return some new data. The maintainability problem is not really solved with this approach---we still have to deal with a lot of data, remembering its semantics outside of the objects. In other words, there is no true encapsulation.

On the other hand, this is how composition would look instead:

Point p2 = new PrintedOn(p, canvas);

Every time we need objects to communicate we create a bigger object that encapsulates more primitive ones, letting them interact inside. Of course, the data will also go from object to object, but that will happen inside a bigger object. We can even make the encapsulator and the encapsulated "friends," as I suggested before, to make that interaction more transparent and avoid data exposure through getters or even printers.
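
Here is a minimal sketch of that composition (the Shape interface, the string canvas, and the drawing() method are all my own simplifications):

```java
interface Shape {
  String drawing();
}

// A bare point, knowing only its coordinates.
class Point implements Shape {
  private final int x;
  private final int y;
  Point(int x, int y) {
    this.x = x;
    this.y = y;
  }
  @Override
  public String drawing() {
    return "(" + this.x + "," + this.y + ")";
  }
}

// The composition: a bigger object encapsulating the point and its
// canvas; their interaction happens inside, no data is sent around.
class PrintedOn implements Shape {
  private final Shape origin;
  private final String canvas;
  PrintedOn(Shape shape, String canvas) {
    this.origin = shape;
    this.canvas = canvas;
  }
  @Override
  public String drawing() {
    return this.canvas + ": " + this.origin.drawing();
  }
}
```

The caller never tells the point to print itself; it composes a bigger object and asks that object instead.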

Let me quote Alan Kay again:

The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.

It seems to me that he means modules, which are not objects. These are different things. Modules are elements of the architecture, while objects are elements of the design. These are two different levels. At the level of architecture we obviously need messages, and Kay's statement is perfectly correct. However, at the level of design we need composable structures to increase maintainability, and messaging is not what helps us achieve this goal.

Thus, I believe Alan Kay was right when he invented objects, called them objects, and gave their programming style the "object-oriented" title.

© Yegor Bugayenko 2014–2018

DAO is Yet Another OOP Shame


  • Odessa, Ukraine

Someone asked me what I think about DAO and I realized that, even though I wrote about ORM, DTO, and getters, I haven't had a chance yet to mention DAO. Here is my take on it: it's as much of a shame as its friends---ORM, DTO, and getters. In a nutshell, a Data Access Object is an object that "provides an abstract interface to some type of database or other persistence mechanism." The purpose is noble, but the implementation is terrible.

Requiem for a Dream (2000) by Darren Aronofsky

Here is how it may look:

class BookDAO {
  Book find(int id);
  void update(Book book);
  // Other methods here ...
}

The idea is simple---method find() creates a DTO Book, someone else injects new data into it and calls update():

BookDAO dao = BookDAOFactory.getBookDAO();
Book book = dao.find(123);
book.setTitle("Don Quixote");
dao.update(book);

What is wrong, you ask? Everything that was wrong with ORM, but instead of a "session" we have this DAO. The problem remains the same: the book is not an object, but a data container. I quote my own three-year-old statement from the ORM article, with a slight change in the name: "DAO, instead of encapsulating database interaction inside an object, extracts it away, literally tearing a solid and cohesive living organism apart." For more details, please check that article.

However, I have to say that I have something like DAOs in most of my pet projects, but they don't return or accept DTOs. Instead, they return objects and sometimes accept operations on them. Here are a few examples. Look at this Pipes interface from Wring.io:

interface Pipes {
  void add(String json);
  Pipe pipe(long number);
}

Its method add() creates a new item in the "collection" and method pipe() returns a single object from the collection. The Pipe is not a DTO; it is a normal object that is fully capable of doing all necessary database operations, without any help from a DAO. For example, there is the Pipe.status(String) method to update its status. I'm not going to use Pipes for that; I just do pipe.status("Hello, world!").
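
A minimal in-memory sketch of that shape (the two interfaces follow the snippet above; the status() getter and all implementation details are my assumptions):

```java
import java.util.ArrayList;
import java.util.List;

interface Pipe {
  void status(String text);
  String status();
}

interface Pipes {
  void add(String json);
  Pipe pipe(long number);
}

// Each Pipe is a live object manipulating its own record in the
// "database"; no DAO and no DTO stand between them.
class InMemoryPipes implements Pipes {
  private final List<String[]> records = new ArrayList<>();
  @Override
  public void add(String json) {
    // A record is a pair: the JSON payload and the current status.
    this.records.add(new String[] {json, ""});
  }
  @Override
  public Pipe pipe(long number) {
    final String[] record = this.records.get((int) number);
    return new Pipe() {
      @Override
      public void status(String text) {
        record[1] = text;
      }
      @Override
      public String status() {
        return record[1];
      }
    };
  }
}
```

The same shape works against a real database: the anonymous Pipe would run UPDATE and SELECT itself, instead of shipping a data bag back to a DAO.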

Here is yet another example from Jare.io: interface Base which returns a list of objects of type Domain. Then, when we want to delete a domain, we just call domain.delete(). The domain is fully capable of doing all necessary database manipulations.

The problem with DAO is right in its name, I believe. It says that we are accessing "data" and does exactly that: goes to the database, retrieves some data, and returns data. Not an object, but data, also known as a "data transfer object." As we discussed before, direct data manipulations are what break encapsulation and make object-oriented code procedural and unmaintainable.

© Yegor Bugayenko 2014–2018

How Micro Is Your Tasking?


  • Moscow, Russia

"What are you doing now?"---when you hear this question from your boss, be aware: you're dealing with a micromanager. Keeping us busy is the key objective of these creatures and this is what makes them so annoying. To the contrary, effective managers make sure we are productive, meaning that our results satisfy their expectations. They are not interested in knowing what we are doing to deliver them---they manage the project, instead of managing us. And the first step to making the project manageable is to decompose its scope into smaller pieces.

Carlito's Way (1993) by Brian De Palma

Imagine you want to re-design your apartment, having a few thousand dollars for this job. You hire a group of people and give them all your money up front. They ask you to come back in two months, when everything will be ready. You say "OK" and wait for two months. I'm sure you already know what I'm getting at---this project most probably will be a failure, to some extent. In the worst case you won't see these guys ever, they will just steal your money. In the best case, they will do something that will look "nice," but not as nice as you expected.

Why do we micromanage?

What do you do in order to increase your chances of getting the best case scenario? That's right, you micromanage them: you visit them every day, you ask them the famous "What are you doing now?" question, you push them when they are getting lazy, you control, you dominate, you annoy, you "stay on top," you play the guilt card when they miss or forget, you punish them every way you can.

You don't do that because you're evil. You just know that otherwise they will trash your apartment, will forget things, will miss something, will make mistakes, will spend more time and money than they are supposed to, will choose wrong fabrics, will purchase the furniture you don't like, and will do many other things you're well aware of if you've ever dealt with interior designers and house builders.

The more aggressive you are, the higher the chances you win.

And it's not because you are evil. You're not evil, you're stupid (not you personally, my dear and respected reader, but you get the point).

The problem is that the project is not manageable. That's why you have to resort to the last possible measure---micromanagement. Why is the project not manageable? Because its scope is not broken down into pieces. It contains a large single job called "Redesign The Apartment."

One of the key success factors for manageability is the famous 0/100 rule, which requires any task to be either "in progress" or "complete." There can be nothing in the middle. When such a rule is in place, the task can be delegated to its performer and they can become responsible for its completion, they can be trusted.

We can't "trust" our single large task to the performer, simply because it's too big to be trusted. If they fail, the cost of failure will be too high. We have to take a micro-scope and get into the task to manage it from the inside, annoying its performers, whom we are supposed to trust. The micromanagement we do is inevitable, because the scope is not broken down.

Scope decomposition was invented mostly in order to solve this very problem: to make the project more manageable. We need small tasks in the scope in order to be able to delegate them and never go inside in order to check what's going on there, who is doing what, why, and where.

The smaller the tasks we can break the scope into, the better.

How small can tasks be?

In our projects we break project scope into tasks of 30 minutes each. This may sound too extreme for you, but it works for us. We call them micro-tasks. We started to practice micro-tasking about seven years ago. We experimented with different task sizes, from 10 hours to 15 minutes, but eventually came down to 30 minutes.

When tasks are bigger, we lose the manageability and simply get back to macro-tasking. When tasks are smaller, the context switching overhead becomes too annoying.

In our experience, a senior programmer, if fully dedicated to a project, completes 6-10 tasks a day. This means that they spend 3-5 hours working, while the rest of the time is spent on doing something else. This is a much more effective use of work time than we can achieve with traditional macro-tasking management, where programmers barely work for two hours a day, spending the rest of their time on something else (my personal observation).

What obstacles did we have?

If you decide to go for micro-tasking, you will most likely have the same or similar obstacles that we've had. Here is a short list of them and my advice, which may help if one day you decide to break a "Develop a Mobile App" task into, say, 2,500+ micro tasks:

  • Vagueness. Definition of done, exit criteria, requirements, expectations---there are many names, which are all about our inability to define what exactly we expect programmers to do. Software development by definition includes a lot of unknowns, creativity, experiments, and risks. Programmers will tell you that they can't guarantee anything in most cases, and can't really give any valid promises. We solve that by PDD.

  • Distraction. Programmers are used to doing many different things at the same time: they write code, help others, watch YouTube, scroll Facebook, swear at Reddit, and read my blog. Initially most of them won't like the idea of having explicit tasks in front of them, simply because they put a structure on their work time, making it much more visible to the management. They will tell you that they have many other things to do aside from your bloody tasks. We solve that by paying by result and prohibition of chitchats.

  • Laziness. Just like those apartment designers, programmers also love to get paid and do nothing. Who doesn't, right? They will tell you that their work is more complicated than you think, that they need much more time, that they need to investigate the problem first, that reading documentation also takes a lot of time, etc. They are simply spoiled by the traditional macro-tasking, where they are paid by the month and nobody really controls their results. They are used to being lazy office slaves. We solve that by paying by result, and the lazy ones simply quit.

  • Responsibility. Micro-tasking will make individual results visible. When tasks are large people tend to work with them in groups and it's unclear exactly who is responsible for failures and successes. Smaller tasks emphasize mistakes and make people "pay" for them. Not necessarily with cash, but definitely some way or other. Most programmers will find this concept very new and disturbing---they've almost never paid for their own mistakes before and never had their own tasks. The responsibility was always spread across the group. We solve that by monetary rewards and punishments, which make them very motivated and failure-ready.

  • Resentment. This is one of the most popular and the most annoying of problems: they will tell you that they are "not monkeys." They will actually combine all problems listed above and say that the right way to solve them is to give programmers freedom and let them do their job, since they are smart enough and don't need any management from the top. And they will mostly be saying it quite honestly, with no intent to manipulate. The thing is that they are used to macro-tasking and believe that this is the only and the right way. I'm trying to solve this by writing this very blog post.

There could be something else, but this is a more or less exhaustive list of the problems we were faced with.

Where micro-tasking didn't work

Obviously, any approach has its pros and cons. Micro-tasking seems to be the most effective management paradigm for us. However, it's not applicable everywhere, according to our experience. There are territories where we failed to apply it.

  • Prototyping. Every project or new big feature starts with a prototype, which requires one or two people sitting next to each other, thinking, and experimenting. We tried to break this piece of work into smaller parts, but failed. It seems that this process has to be done as a single solid work package.

  • UI/UX. We were mostly working with server-side Java/Ruby/C++ projects for the last few years and didn't have many opportunities to apply micro-tasking to the UI/UX jobs. However, whatever we did try never really worked: graphic designers weren't able to decompose their tasks into smaller parts.

  • Customers. We tried to decompose the task of eliciting requirements from our clients and failed a few times. Maybe our clients were stupid (I doubt that), maybe requirements were too complex, or maybe our system analysts were not professional enough. The bottom line is that we realized that such a task must be done as a solid piece of work, without any decomposition.

  • Fire-fighting. When the speed of delivery is the most important concern, micro-tasking doesn't work for us. The overhead for dispatching and specifying tasks was taking too much time. When something is really urgent, we have to do the traditional macro-tasking and just "make it work." Then we get back to micro-tasking.

Everything else, including programming, unit testing, manual testing, performance/load testing, integration testing, deployment, code review, documentation, and even training, can and must be managed via micro tasks.

What benefits do we get?

The most important benefit of micro-tasking is that the project becomes more manageable, as was mentioned above. Here is a more detailed breakdown:

  • Money works. When tasks are very small, we can use good old dollar bills to motivate programmers. We can throw away literally all other instruments.

  • We pay less. We seriously lower our expenses, even though hourly rates of our programmers are higher than many other projects can afford. I did a more or less detailed analysis a few years ago, which demonstrated that our projects were 30 (!) times more cost efficient than traditional ones.

  • Motivation is high. Despite a very common stereotype that small and isolated tasks demotivate their performers, we see quite the opposite reaction: programmers are excited when it finally becomes possible to work within well-defined and explicit boundaries, independently, and in isolation. Not all of them, though. My personal observation is: only 25% of them can understand and enjoy micro-tasking. Others are either not professional enough or spoiled by office slavery, where it's possible to do almost nothing and stay very respected and well rewarded.

  • Turnover is not painful. To make micro-tasking possible the management has to learn how to specify them: explicitly, unambiguously, and fast. When such a high level of transparency, formality, and agility (at the same time) is achieved, the project becomes less dependent on experts. We lose the fear of losing people, because almost everything we need to know about the project is inside our task tracking system and project documentation.

  • Easier to parallelize. Smaller tasks are easier to delegate to a larger number of programmers. In some projects we sometimes have over 40 programmers at the same time, while the number of tasks is relatively small (up to 200).

  • Metrics work. When one programmer closes 40-50 tasks per week and another one closes 5-10, it does mean something, keeping in mind that all tasks are almost equal in size. We use this metric (and a few others) to make organizational and discipline decisions. In a macro-tasking environment almost no HR metrics really work.

  • Quality is enforceable. A large number of small tasks implies that we constantly and frequently close them. Each closure is an important point for quality control. That's exactly where we have the ability to say "No" and reject the deliverables that violate our quality standards. With big tasks this "No" is much more painful for programmers.

  • Risks are acceptable. It's impossible to accept the risk of the entire apartment redesign failure, since its cost is too high---a few thousand dollars. However, it is absolutely affordable to accept a risk of a kitchen lamp installation. Even if it falls and breaks, we will spend a few dollars and buy a new one. We don't need to control the "lamp person"---we have a luxury to delegate and trust.

The benefits programmers get overlap with our benefits. If they are professional and motivated enough they find it effective and productive to work with micro tasks, which are always well defined and properly paid.

Are we monkeys or not?

Now the most typical complaint we hear about micro-tasking is: "It is for junior programmers who are OK with being code monkeys." To be honest, I also thought so a few years ago, when we started to experiment with XDSD. What I quickly found out is that the most professional and self-motivated developers were enjoying micro-tasking, while their less mature and less skilled colleagues were finding it difficult to keep up.

© Yegor Bugayenko 2014–2018

Trust. Pay. Lose.


  • Moscow, Russia

"Listen up, dude," a friend of mine said when he called yesterday, "I trusted them for over a year---we've been partners. They've been programming it all and I was busy doing business development. Now they've quit and I'm left with nothing! What am I supposed to do with all these JavaScript files? How do I even know they are mine? Moreover, they don't even want to cooperate. I feel like a hostage. Please, help me out!" What could I say? "It's too late, dude," was my answer, "but the good news is---you are not the first to have this problem."

The Godfather (1972) by Francis Ford Coppola

"Trust, pay, lose" is what I would call this very typical scenario.

First, you trust your programmers. You call them partners. You believe in them. You are sure that you picked the best ones. They seem to be very reliable. You look at their resumes and feel excited. They know JavaScript, and DevOps, and GitHub, and even Big Data. They definitely are the best. Moreover, they've been in this business for ten years. What else do you need, right?

Second, you pay them. How else would they work, right? True talent is expensive, we all know that. They bill you regularly for the time they spend working on your project. You feel excited to see how your money turns into the software that works. They demonstrate new versions regularly. There are bugs, of course, but this is how it should be, right? They explain everything to you and you keep paying.

Finally, you lose when you realize that it's their software, not yours. They quit because of some business reasons and you're left with nothing. You can't understand those files. You don't even have them, since they are somewhere in the programmers' Git repository. You hire some more people to help you save what's left, but they say that it's time to start everything from scratch. Your frustration is enormous and you're ready to go back to the first step---you trust these new guys, because they definitely seem legit, not like those previous crooks.

Sound familiar?

What is the alternative, you ask?

Don't trust.

Instead, before you start a project, hire an independent expert, who will regularly (ideally, every two weeks) review everything these guys are doing and tell you where and how you may lose. This expert will maintain a Risk List for you. You will take necessary preemptive actions.

Don't trust us programmers. We are smart, lazy and spoiled.

You will lose.

© Yegor Bugayenko 2014–2018

Constructors or Static Factory Methods?


  • Odessa, Ukraine

I believe Joshua Bloch said it first in his very good book "Effective Java": static factory methods are the preferred way to instantiate objects compared with constructors. I disagree. Not only because I believe that static methods are pure evil, but mostly because in this particular case they pretend to be good and make us think that we have to love them.

Extract (2009) by Mike Judge

Let's analyze the reasoning and see why it's wrong, from an object-oriented point of view.

This is a class with one primary and two secondary constructors:

class Color {
  private final int hex;
  Color(String rgb) {
    this(Integer.parseInt(rgb, 16));
  }
  Color(int red, int green, int blue) {
    this((red << 16) + (green << 8) + blue);
  }
  Color(int h) {
    this.hex = h;
  }
}

This is a similar class with three static factory methods:

class Color {
  private final int hex;
  static Color makeFromRGB(String rgb) {
    return new Color(Integer.parseInt(rgb, 16));
  }
  static Color makeFromPalette(int red, int green, int blue) {
    return new Color((red << 16) + (green << 8) + blue);
  }
  static Color makeFromHex(int h) {
    return new Color(h);
  }
  private Color(int h) {
    this.hex = h;
  }
}

Which one do you like better?

According to Joshua Bloch, there are three basic advantages to using static factory methods instead of constructors (there are actually four, but the fourth one is not applicable to Java anymore):

  • They have names.
  • They can cache.
  • They can subtype.

I believe that all three make perfect sense ... if the design is wrong. They are good excuses for workarounds. Let's take them one by one.

They Have Names

This is how you make a red tomato color object with a constructor:

Color tomato = new Color(255, 99, 71);

This is how you do it with a static factory method:

Color tomato = Color.makeFromPalette(255, 99, 71);

It seems that makeFromPalette() is semantically richer than just new Color(), right? Well, yes. Who knows what those three numbers mean if we just pass them to the constructor. But the word "palette" helps us figure everything out immediately.

True.

However, the right solution would be to use polymorphism and encapsulation, to decompose the problem into a few semantically rich classes:

interface Color {
}
class HexColor implements Color {
  private final int hex;
  HexColor(int h) {
    this.hex = h;
  }
}
class RGBColor implements Color {
  private final Color origin;
  RGBColor(int red, int green, int blue) {
    this.origin = new HexColor(
      (red << 16) + (green << 8) + blue
    );
  }
}

Now, we use the right constructor of the right class:

Color tomato = new RGBColor(255, 99, 71);

See, Joshua?

They Can Cache

Let's say I need a red tomato color in multiple places in the application:

Color tomato = new Color(255, 99, 71);
// ... sometime later
Color red = new Color(255, 99, 71);

Two objects will be created, which is obviously inefficient, since they are identical. It would be better to keep the first instance somewhere in memory and return it when the second call arrives. Static factory methods make it possible to solve this very problem:

Color tomato = Color.makeFromPalette(255, 99, 71);
// ... sometime later
Color red = Color.makeFromPalette(255, 99, 71);

Then somewhere inside the Color we keep a private static Map with all the objects already instantiated:

class Color {
  private static final Map<Integer, Color> CACHE =
    new HashMap<>();
  private final int hex;
  static Color makeFromPalette(int red, int green, int blue) {
    final int hex = (red << 16) + (green << 8) + blue;
    return Color.CACHE.computeIfAbsent(
      hex, h -> new Color(h)
    );
  }
  private Color(int h) {
    this.hex = h;
  }
}

It is very effective performance-wise. With a small object like our Color the problem may not be so obvious, but when objects are bigger, their instantiation and garbage collection may waste a lot of time.

True.

However, there is an object-oriented way to solve this problem. We just introduce a new class Palette, which becomes a store of colors:

class Palette {
  private final Map<Integer, Color> colors =
    new HashMap<>();
  Color take(int red, int green, int blue) {
    final int hex = (red << 16) + (green << 8) + blue;
    return this.colors.computeIfAbsent(
      hex, h -> new Color(h)
    );
  }
}

Now, we make an instance of Palette once and ask it to return a color to us every time we need it:

Color tomato = palette.take(255, 99, 71);
// Later we will get the same instance:
Color red = palette.take(255, 99, 71);

See, Joshua, no static methods, no static attributes.
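To see that Palette really does cache, here is a self-contained sketch (Color reduced to a bare hex holder, and the parentheses around the shifts made explicit, since + binds tighter than << in Java). Both calls return the very same instance:

```java
import java.util.HashMap;
import java.util.Map;

final class Color {
  final int hex;
  Color(int h) {
    this.hex = h;
  }
}

final class Palette {
  private final Map<Integer, Color> colors = new HashMap<>();
  Color take(int red, int green, int blue) {
    // Parentheses matter: '+' has higher precedence than '<<' in Java.
    final int hex = (red << 16) + (green << 8) + blue;
    return this.colors.computeIfAbsent(hex, h -> new Color(h));
  }
}

class PaletteDemo {
  public static void main(String[] args) {
    Palette palette = new Palette();
    Color tomato = palette.take(255, 99, 71);
    Color red = palette.take(255, 99, 71);
    System.out.println(tomato == red); // true: the same cached instance
  }
}
```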

They Can Subtype

Let's say our class Color has a method lighter(), which is supposed to shift the color to the next available lighter one:

class Color {
  protected final int hex;
  Color(int h) {
    this.hex = h;
  }
  public Color lighter() {
    return new Color(hex + 0x111);
  }
}

However, sometimes it's more desirable to pick the next lighter color through a set of available Pantone colors:

class PantoneColor extends Color {
  private final PantoneName pantone;
  PantoneColor(String name) {
    this(new PantoneName(name));
  }
  PantoneColor(PantoneName name) {
    super(name.hex()); // assuming PantoneName can expose its RGB hex value
    this.pantone = name;
  }
  @Override
  public Color lighter() {
    return new PantoneColor(this.pantone.up());
  }
}

Then, we create a static factory method, which will decide which Color implementation is the most suitable for us:

class Color {
  private final String code;
  static Color make(int h) {
    if (h == 0xBF1932) {
      return new PantoneColor("19-1664 TPX");
    }
    return new RGBColor(h);
  }
}

If the true red color is requested, we return an instance of PantoneColor. In all other cases it's just a standard RGBColor. The decision is made by the static factory method. This is how we will call it:

Color color = Color.make(0xBF1932);

It would not be possible to do the same "forking" with a constructor, since it can only return the class it is declared in. A static method has all the necessary freedom to return any subtype of Color.

True.

However, in an object-oriented world we can and must do it all differently. First, we would make Color an interface:

interface Color {
  Color lighter();
  int hex();
}

Next, we would move this decision making process to its own class Colors, just like we did in the previous example:

class Colors {
  Color make(int h) {
    if (h == 0xBF1932) {
      return new PantoneColor("19-1664 TPX");
    }
    return new RGBColor(h);
  }
}

And we would use an instance of class Colors instead of a static factory method inside Color:

colors.make(0xBF1932);

However, this is still not really an object-oriented way of thinking, because we're taking the decision-making away from the object it belongs to. Either through a static factory method make() or a new class Colors---it doesn't really matter how---we tear our objects into two pieces. The first piece is the object itself and the second one is the decision making algorithm that stays somewhere else.

A much more object-oriented design would be to put the logic into an object of class PantoneColor which would decorate the original RGBColor:

class PantoneColor implements Color {
  private final Color origin;
  PantoneColor(Color color) {
    this.origin = color;
  }
  @Override
  public Color lighter() {
    final Color next;
    if (this.origin.hex() == 0xBF1932) {
      next = new RGBColor(0xD12631);
    } else {
      next = this.origin.lighter();
    }
    return new PantoneColor(next);
  }
}

Then, we make an instance of RGBColor and decorate it with PantoneColor:

Color red = new PantoneColor(
  new RGBColor(0xBF1932)
);

We ask red to return a lighter color and it returns the one from the Pantone palette, not the one that is merely lighter in RGB coordinates:

Color lighter = red.lighter(); // 0xD12631

Of course, this example is rather primitive and needs further improvement if we really want it to be applicable to all Pantone colors, but I hope you get the idea. The logic must stay inside the class, not somewhere outside, not in static factory methods or even in some other supplementary class. I'm talking about the logic that belongs to this particular class, of course. If it's something related to the management of class instances, then there can be containers and stores, just like in the previous example above.

To summarize, I would strongly recommend you never use static methods, especially when they are going to replace object constructors. Giving birth to an object through its constructor is the most "sacred" moment in any object-oriented software, don't miss the beauty of it.

© Yegor Bugayenko 2014–2018

Five Features to Make Java Even Better

  • Odessa, Ukraine

I stumbled upon this proposal by Brian Goetz for data classes in Java, and immediately realized that I too have a few ideas about how to make Java better as a language. I actually have many of them, but this is a short list of the five most important.

Idiocracy (2006) by Mike Judge

Global Variables. There are Singletons in Java, which, as we all know, are nothing but global variables. Wouldn't it be great to enable global variables in Java and get rid of Singletons? PHP, JavaScript, Ruby, and many other languages have them; why doesn't Java? Look at this code:

class User {
  private static User INSTANCE;
  private User() {}
  public static User getInstance() {
    synchronized (User.class) {
      if (User.INSTANCE == null) {
        User.INSTANCE = new User();
      }
    }
    return User.INSTANCE;
  }
  public String getName() {
    // return user's name
  }
}

Then, to access it we have to use:

String name = User.getInstance().getName();

This is a Singleton. See how verbose it is? We can simply replace it with a global variable (global is the keyword I'm suggesting we use):

global User user;

And then:

user.getName();

Much less code to write, and way easier to read!

Global Functions and Namespaces

To group static methods together we create utility classes, where we have to define private constructors to prevent their instantiation. Also, we have to remember which particular utility class a static method is in. It's just extra hassle. I'm suggesting we add global functions to Java and optional "namespaces" to group them. Take a look at this utility class:

class TextUtils {
  private TextUtils() {}
  public static String trim(String text) {
    if (text == null) {
      return "";
    }
    return text.trim();
  }
}

Now look at this global function with a namespace:

namespace TextUtils {
  String trim(String text) {
    if (text == null) {
      return "";
    }
    return text.trim();
  }
}

My point is that since we are already using classes as collections of functions, let's make it more convenient. In some applications we won't even need namespaces, just global functions, like in C and C++.

Full Access to Private Attributes and Methods

In order to access a private attribute or a method of an object from outside we have to use the Reflection API. It's not particularly difficult, but it does take a few lines of code, which are not so easy to read and understand:

class Point {
  private int x;
  private int y;
}
Point point = new Point();
Field field = point.getClass().getDeclaredField("x");
field.setAccessible(true);
int x = (int) field.get(point);

I'm suggesting we allow any object to access any of the attributes and methods of another object:

Point point = new Point();
int x = point.x;

Of course, if they are private, the compiler will issue a warning. You can simply ignore it and move on. If you really care about encapsulation, pay attention to the warning and do something else. But in most cases programmers will ignore it, since they would happily use the Reflection API anyway.

NULL by Default

It would be convenient if we could call constructors and methods with an incomplete set of arguments. The arguments we don't provide would be set to null by default. Also, when a method has to return something but there is no return statement, Java should return null. This is almost exactly how it works in PHP, Ruby, and many other languages. I believe it would be a convenient feature for Java developers too.

We won't need to define so many methods when some of the arguments are optional. Method overloading is very verbose and difficult to understand. Instead, we should have one method with a long list of arguments. Some of them will be provided by the caller, others will be set to null. The method will decide what to do, for example:

void save(File file, String encoding) {
  if (encoding == null) {
    encoding = "UTF-8";
  }
}

Then we just call either save(f) or save(f, "UTF-16"). The method will understand what we mean. We can also make it even more convenient, like it's done in Ruby, providing method arguments by names:

save(file: f, encoding: "UTF-16");

Also, when there is nothing to return, the method must return null by default. Writing return null is just a waste of a code line and doesn't really improve readability. Take a look:

String load(File file) {
  if (file.exists()) {
    return read_the_content();
  }
}

It's obvious from this code that if the file exists, the method loads and returns its content. If not, it returns null, which will be a good indicator for the caller that something is not right and the content of the file is not available.

Getters and Setters

I think it's only obvious that we need this feature: every private attribute must automatically have a setter and a getter. There should be no need to create them; Java should provide them out of the box, just like Kotlin and Ruby do. What is the point of having an attribute if there are no getters and setters to read and modify it, right?
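For the record, this is the hand-written boilerplate such a feature would eliminate; a plain Point written the way we do it today:

```java
// Manual accessors: exactly the code that auto-generated
// getters and setters would spare us from writing.
class Point {
  private int x;
  private int y;
  public int getX() {
    return this.x;
  }
  public void setX(int val) {
    this.x = val;
  }
  public int getY() {
    return this.y;
  }
  public void setY(int val) {
    this.y = val;
  }
}

class PointDemo {
  public static void main(String[] args) {
    Point point = new Point();
    point.setX(5);
    System.out.println(point.getX()); // 5
  }
}
```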

With this new feature we'll no longer need the help of Lombok or IntelliJ IDEA.


Maybe I should turn my ideas into official proposals to JCP. What do you think?


Software Quality Award, 2018

  • Dnipro, Ukraine

This is the fourth year of the Software Quality Award. The prize is still the same---$4,096. The rules are still the same. Read on. Previous years are here: 2015, 2016, 2017.

Fill THIS FORM to submit.

Rules:

  • One person can submit only one project.

  • Submissions are accepted until September 1, 2018.

  • I will check the commit history to make sure you're the main contributor to the project.

  • I reserve the right to reject any submission without explanation.

  • All submissions will be published on this page (including rejected ones).

  • Results will be announced October 15, 2018 on this page and by email.

  • The best project will receive $4,096 (I may split this amount among a few projects).

  • Final decisions will be made by me and are not negotiable (although I may invite other people to help me make the right decision).

  • Winners that received any cash prizes in previous years can't submit again.

Each project must be:

  • Open source (in GitHub).

  • At least 10,000 lines of code (cloc without any arguments).

  • At least one year old.

  • Object-oriented (that's the only thing I understand).

The best project is selected using these criteria.

What doesn't matter:

  • Popularity. Even if nobody is using your product, it is still eligible for this award. I don't care about popularity; quality is the key.

  • Programming language. I believe that any language, used correctly, can be applied to design a high-quality product.

  • Buzz and trends. Even if your project is yet another parser of command line arguments, it's still eligible for the award. I don't care about your marketing position; quality is all.

By the way, if you want to sponsor this award and increase the bonus, email me.


Lazy Loading and Caching via Sticky Cactoos Primitives

  • Odessa, Ukraine

You obviously know what lazy loading is, right? And you no doubt know about caching. To my knowledge, there is no elegant way in Java to implement either of them. Here is what I found out for myself with the help of Cactoos primitives.

Reality (2012) by Matteo Garrone

Let's say we need an object that will encrypt some text. Speaking in a more object-oriented way, it will encapsulate the text and become its encrypted form. Here is how we will use it (let's create tests first):

interface Encrypted {
  String asString() throws IOException;
}
Encrypted enc = new EncryptedX("Hello, world!");
System.out.println(enc.asString());

Now let's implement it, in a very primitive way, with one primary constructor. The encryption mechanism will just add +1 to each byte in the incoming data, and will assume that the encryption won't break anything (a very stupid assumption, but for the sake of this example it will work):

class Encrypted1 implements Encrypted {
  private final String text;
  Encrypted1(String txt) {
    this.text = txt;
  }
  @Override
  public String asString() {
    final byte[] in = this.text.getBytes();
    final byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; ++i) {
      out[i] = (byte) (in[i] + 1);
    }
    return new String(out);
  }
}

Looks correct so far? I tested it and it works. If the input is "Hello, world!", the output will be "Ifmmp-!xpsme\"".
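Since we promised to start with tests, here is a minimal standalone check of that claim (Encrypted1 is reproduced so the snippet compiles on its own; a plain main stands in for a real testing framework):

```java
class Encrypted1 {
  private final String text;
  Encrypted1(String txt) {
    this.text = txt;
  }
  public String asString() {
    // Shift every byte by +1: 'H' -> 'I', 'e' -> 'f', ..., '!' -> '"'.
    final byte[] in = this.text.getBytes();
    final byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; ++i) {
      out[i] = (byte) (in[i] + 1);
    }
    return new String(out);
  }
}

class Encrypted1Test {
  public static void main(String[] args) {
    System.out.println(new Encrypted1("Hello, world!").asString());
    // Ifmmp-!xpsme"
  }
}
```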

Next, let's say that we want our class to accept an InputStream as well as a String. We want to call it like this, for example:

Encrypted enc = new Encrypted2(
  new FileInputStream("/tmp/hello.txt")
);
System.out.println(enc.asString());

Here is the most obvious implementation, with two primary constructors (again, the implementation is primitive, but works):

class Encrypted2 implements Encrypted {
  private final String text;
  Encrypted2(InputStream input) throws IOException {
    ByteArrayOutputStream baos =
      new ByteArrayOutputStream();
    while (true) {
      int one = input.read();
      if (one < 0) {
        break;
      }
      baos.write(one);
    }
    this.text = new String(baos.toByteArray());
  }
  Encrypted2(String txt) {
    this.text = txt;
  }
  // asString() is exactly the same as in Encrypted1
}

Technically it works, but stream reading is right inside the constructor, which is bad practice. Primary constructors must not do anything but attribute assignments, while secondary ones may only create new objects.

Let's try to refactor and introduce lazy loading:

class Encrypted3 implements Encrypted {
  private String text;
  private final InputStream input;
  Encrypted3(InputStream stream) {
    this.text = null;
    this.input = stream;
  }
  Encrypted3(String txt) {
    this.text = txt;
    this.input = null;
  }
  @Override
  public String asString() throws IOException {
    if (this.text == null) {
      ByteArrayOutputStream baos =
        new ByteArrayOutputStream();
      while (true) {
        int one = input.read();
        if (one < 0) {
          break;
        }
        baos.write(one);
      }
      this.text = new String(baos.toByteArray());
    }
    final byte[] in = this.text.getBytes();
    final byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; ++i) {
      out[i] = (byte) (in[i] + 1);
    }
    return new String(out);
  }
}

Works great, but looks ugly. The ugliest part is these two lines of course:

this.text = null;
this.input = null;

They make the object mutable, and they're using NULL. It's ugly, trust me. Unfortunately, lazy loading and NULL references always come together in classic examples. However, there is a better way to implement it. Let's refactor our class, this time using Scalar from Cactoos:

class Encrypted4 implements Encrypted {
  private final IoCheckedScalar<String> text;
  Encrypted4(InputStream stream) {
    this(
      () -> {
        ByteArrayOutputStream baos =
          new ByteArrayOutputStream();
        while (true) {
          int one = stream.read();
          if (one < 0) {
            break;
          }
          baos.write(one);
        }
        return new String(baos.toByteArray());
      }
    );
  }
  Encrypted4(String txt) {
    this(() -> txt);
  }
  Encrypted4(Scalar<String> source) {
    this.text = new IoCheckedScalar<>(source);
  }
  @Override
  public String asString() throws IOException {
    final byte[] in = this.text.value().getBytes();
    final byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; ++i) {
      out[i] = (byte) (in[i] + 1);
    }
    return new String(out);
  }
}

Now it looks way better. First of all, there is only one primary constructor and two secondary ones. Second, the object is immutable. Third, there is still a lot of room for improvement: we can add more constructors which will accept other sources of data, for example File or a byte array.

In a nutshell, the attribute that is supposed to be loaded in a "lazy" way is represented inside an object as a "function" (lambda expression in Java 8). Until we touch that attribute, it's not loaded. Once we need to work with it, the function gets executed and we have the result.

There is one problem with this code though. It will read the input stream every time we call asString(), which will obviously not work, since only the first time will the stream have the data. On every subsequent call the stream will simply be empty. Thus, we need to make sure that this.text.value() executes the encapsulated Scalar only once. All later calls must return the previously calculated value. So we need to cache it. Here is how:

class Encrypted5 implements Encrypted {
  private final IoCheckedScalar<String> text;
  // same as above in Encrypted4
  Encrypted5(Scalar<String> source) {
    this.text = new IoCheckedScalar<>(
      new StickyScalar<>(source)
    );
  }
  // same as above in Encrypted4
}

This StickyScalar will make sure that only the first call to its method value() will go through to the encapsulated Scalar. All other calls will receive the result of the first call.
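If you want to see the mechanics without the Cactoos dependency, the idea behind StickyScalar can be sketched as a memoizing Supplier (an illustration of the principle, not the actual Cactoos implementation):

```java
import java.util.function.Supplier;

// Calls the origin exactly once; all later calls return the cached value.
final class Sticky<T> implements Supplier<T> {
  private final Supplier<T> origin;
  private T cache;
  private boolean done;
  Sticky(Supplier<T> origin) {
    this.origin = origin;
  }
  @Override
  public T get() {
    if (!this.done) {
      this.cache = this.origin.get();
      this.done = true;
    }
    return this.cache;
  }
}

class StickyDemo {
  public static void main(String[] args) {
    final int[] calls = new int[1];
    Supplier<String> text = new Sticky<>(() -> {
      ++calls[0]; // count how many times the origin is actually touched
      return "loaded";
    });
    text.get();
    text.get();
    System.out.println(calls[0]); // 1
  }
}
```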

The last problem to solve is about concurrency. The code we have above is not thread safe. If I create an instance of Encrypted5 and pass it to two threads, which call asString() simultaneously, the result will be unpredictable, simply because StickyScalar is not thread-safe. There is another primitive to help us out though, called SyncScalar:

class Encrypted5 implements Encrypted {
  private final IoCheckedScalar<String> text;
  // same as above in Encrypted4
  Encrypted5(Scalar<String> source) {
    this.text = new IoCheckedScalar<>(
      new SyncScalar<>(
        new StickyScalar<>(source)
      )
    );
  }
  // same as above in Encrypted4
}

Now we're safe and the design is elegant. It includes lazy loading and caching.

I'm using this approach in many projects now and it seems convenient, clear, and object-oriented.


Streams vs. Decorators

  • Odessa, Ukraine

The Streams API was introduced in Java 8, together with lambda expressions, just a few years ago. I, as a disciplined Java adept, tried to use this new feature in a few of my projects, for example here and here. I didn't really like it and went back to good old decorators. Moreover, I created Cactoos, a library of decorators, to replace Guava, which is not so good in so many places.

La Haine (1995) by Mathieu Kassovitz

Here is a primitive example. Let's say we have a collection of measurements coming in from some data source; they are all numbers between zero and one:

Iterable<Double> probes;

Now, we need to show only the first 10 of them, ignoring zeros and ones, and re-scaling them to (0..100). Sounds like an easy task, right? There are three ways to do it: procedural, object-oriented, and the Java 8 way. Let's start with the procedural way:

int pos = 0;
for (Double probe : probes) {
  if (probe == 0.0d || probe == 1.0d) {
    continue;
  }
  if (++pos > 10) {
    break;
  }
  System.out.printf(
    "Probe #%d: %f", pos, probe * 100.0d
  );
}

Why is this a procedural way? Because it's imperative. Why is it imperative? Because it's procedural. Nah, I'm kidding.

It's imperative because we're giving instructions to the computer about what data to put where and how to iterate through it. We're not declaring the result, but imperatively building it. It works, but it's not really scalable. We can't take part of this algorithm and apply it to another use case. We can't really modify it easily, for example to take numbers from two sources instead of one, etc. It's procedural. Enough said. Don't do it this way.

Now, Java 8 gives us the Streams API, which is supposed to offer a functional way to do the same. Let's try to use it.

First, we need to create an instance of Stream, which Iterable doesn't let us obtain directly. Then we use the stream API to do the job:

StreamSupport.stream(probes.spliterator(), false)
  .filter(p -> p != 0.0d && p != 1.0d)
  .limit(10L)
  .forEach(
    probe -> System.out.printf(
      "Probe #%d: %f", 0, probe * 100.0d
    )
  );

This will work, but will say Probe #0 for all probes, because forEach() doesn't work with indexes. There is no such thing as forEachWithIndex() in the Stream interface as of Java 8 (and Java 9 too). Here is a workaround with an atomic counter:

AtomicInteger index = new AtomicInteger();
StreamSupport.stream(probes.spliterator(), false)
  .filter(probe -> probe != 0.0d && probe != 1.0d)
  .limit(10L)
  .forEach(
    probe -> System.out.printf(
      "Probe #%d: %f",
      index.getAndIncrement(),
      probe * 100.0d
    )
  );

"What's wrong with that?" you may ask. First, see how easily we got into trouble when we didn't find the right method in the Stream interface. We immediately fell off the "streaming" paradigm and got back to the good old procedural global variable (the counter). Second, we don't really see what's going on inside those filter(), limit(), and forEach() methods. How exactly do they work? The documentation says that this approach is "declarative" and each method in the Stream interface returns an instance of some class. What classes are they? We have no idea by just looking at this code.

These two problems are connected. The biggest issue with this streaming API is the very interface Stream---it's huge. At the time of writing there are 43 methods. Forty three, in a single interface! This is against each and every principle of object-oriented programming, starting with SOLID and then up to more serious ones.
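You can check the claim on your own JDK with a line of reflection; the exact count depends on the release (43 at the time of the article, and the interface has only grown since):

```java
import java.util.stream.Stream;

class StreamSize {
  public static void main(String[] args) {
    // Methods declared directly on Stream (not counting those
    // inherited from BaseStream); the number varies by JDK version.
    System.out.println(Stream.class.getDeclaredMethods().length);
  }
}
```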

What is the object-oriented way to implement the same algorithm? Here is how I would do it with Cactoos, which is just a collection of primitive simple Java classes:

new And(
  new Mapped<Double, Scalar<Boolean>>(
    new Limited<Double>(
      new Filtered<Double>(
        probes,
        probe -> probe != 0.0d && probe != 1.0d
      ),
      10
    ),
    probe -> () -> {
      System.out.printf(
        "Probe #%d: %f", 0, probe * 100.0d
      );
      return true;
    }
  )
).value();

Let's see what's going on here. First, Filtered decorates our iterable probes to take certain items out of it. Notice that Filtered implements Iterable. Then Limited, also being an Iterable, takes only the first ten items out. Then Mapped converts each probe into an instance of Scalar<Boolean>, which does the line printing.

Finally, the instance of And goes through the list of "scalars" and asks each of them to return a boolean. They print the line and return true. Since the result is true, And proceeds to the next scalar. Finally, its method value() returns true.

But wait, there are no indexes. Let's add them. In order to do that we just use another class, called AndWithIndex:

new AndWithIndex(
  new Mapped<Double, Func<Integer, Boolean>>(
    new Limited<Double>(
      new Filtered<Double>(
        probes,
        probe -> probe != 0.0d && probe != 1.0d
      ),
      10
    ),
    probe -> index -> {
      System.out.printf(
        "Probe #%d: %f", index, probe * 100.0d
      );
      return true;
    }
  )
).value();

Instead of Scalar<Boolean> we now map our probes to Func<Integer, Boolean> to let them accept the index.

The beauty of this approach is that all classes and interfaces are small, and that's why they're very composable. To make an iterable of probes limited, we decorate it with Limited; to make it filtered, we decorate it with Filtered; to do something else, we create a new decorator and use it. We're not stuck with one single huge interface like Stream.
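For instance, the re-scaling to (0..100) could itself become such a decorator. Here is a plain-Java sketch (Scaled is my own illustrative name, not a Cactoos class):

```java
import java.util.Arrays;
import java.util.Iterator;

// Decorates an Iterable<Double>, multiplying every element on the fly.
final class Scaled implements Iterable<Double> {
  private final Iterable<Double> origin;
  private final double factor;
  Scaled(Iterable<Double> origin, double factor) {
    this.origin = origin;
    this.factor = factor;
  }
  @Override
  public Iterator<Double> iterator() {
    final Iterator<Double> source = this.origin.iterator();
    return new Iterator<Double>() {
      @Override
      public boolean hasNext() {
        return source.hasNext();
      }
      @Override
      public Double next() {
        return source.next() * Scaled.this.factor;
      }
    };
  }
}

class ScaledDemo {
  public static void main(String[] args) {
    for (Double probe : new Scaled(Arrays.asList(0.5, 0.25), 100.0d)) {
      System.out.println(probe); // 50.0, then 25.0
    }
  }
}
```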

The bottom line is that decorators are an object-oriented instrument for modifying the behavior of collections, while streams are something else entirely, which I can't even find a name for.

P.S. By the way, this is how the same algorithm can be implemented with the help of Guava's Iterables:

Iterable<Double> ready = Iterables.limit(
  Iterables.filter(
    probes,
    probe -> probe != 0.0d && probe != 1.0d
  ),
  10
);
int pos = 0;
for (Double probe : ready) {
  System.out.printf(
    "Probe #%d: %f", pos++, probe * 100.0d
  );
}

This is some weird combination of object-oriented and functional styles.


Java 9: The Good, The Bad, and Private Interface Methods

  • Odessa, Ukraine

Java 9 was released a few weeks ago. Check the release notes, they include many interesting features. However, I think that not everything is as good as Oracle and Java adepts seem to picture it. I see three trends in the Java world, which are good, bad, and ugly, respectively. Let's start with the good one.

Birdman (2014) by Alejandro G. Iñárritu

The Platform

The first trend is an obvious improvement of the platform that compiles Java, packages JARs, and runs the bytecode. It definitely becomes better with every new Java release, and Java 9 brought a number of improvements here which are very useful, without doubt.

The platform is obviously becoming more mature. This is a good trend.

The JDK

The second trend, which I've observed since Java 6, shows that the JDK, which is essentially a collection of classes and interfaces designed, developed, and maintained by Oracle, gets bigger with every new release. In Java 9, among other things, a number of APIs were added and extended.

Of course some features must be implemented in the JDK itself, like Unicode support (JEP 267), platform-specific Desktop features (JEP 272), Spin-Wait Hints (JEP 285), compact strings (JEP 254), and the process API (JEP 102). Their implementation depends on the underlying platform and has to be provided together with the JVM.

But what is the HTTP 2.0 client doing in the JDK, together with JAX-RS, JPA, JAX-WS, JDBC, and many other things that, in my opinion, should stay as far away from Oracle as possible? They are not platform-specific, and they could be designed much better by the open source community as independent packages. Aggregating them under one monster umbrella brand is a mistake, I believe.

I think that big corporations only kill the software market, instead of making it better, because of the financial and political motives they expose it to. That's exactly what is happening with the JDK. Thanks to the Oracle monopoly it lacks flexibility and dynamism in its growth. In other words, we're stuck with what Oracle and its big friends think is right.

Thus, making JDK bigger is a bad trend. Instead, I believe, Oracle would only benefit from making it smaller, delegating everything that is not platform-specific to the open source community, supporting programmers somehow and promoting open and effective standardization processes on the market.

The Language

Java was developed by James Gosling at Sun Microsystems in 1995 as an object-oriented language. There were many concerns about this claim of object-orientation, and I'm also not sure that Java is more OO than it is procedural. However, it is officially object-oriented.

There were many procedural features inherited by Java from C/C++ since its first version, including static methods, NULL, implementation inheritance, etc. It was not a perfect object-oriented language, and it was not going to be one, as I understand it. The key idea was to create something that could be written once and run anywhere. However, the language itself was also a big deal, not just the JVM. It was simple and sexy.

Java 5 made a serious step forward in 2004 and improved the language by adding generics, the for-each loop, varargs, and static import. However, annotations and enumerations were also introduced, which helped the language drift away from the object paradigm toward something completely different and procedural.

Java 7 added try-with-resource in 2011, which was a good move, in line with the OOP paradigm.

Java 8 added lambda expressions in 2014, which was a great feature, but absolutely irrelevant to OOP. Lambdas and the Streams API turned Java into a mix of the object, procedural, and functional paradigms. Default methods were also added to interfaces, which turned types into libraries of code. Types into libraries! It's even worse than implementation inheritance, if you ask me.

Now Java 9 made the next "improvement" to interfaces, allowing them to have private methods. Private static methods in types! Can you believe it? What will be the next step? Attributes, in Java 10, I guess.
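To make the complaint concrete, this is what Java 9 now permits (the names are mine, for illustration only):

```java
// A Java 9 interface with a private method: the type now carries
// implementation details that only its own default methods can call.
interface Greeting {
  private String name() {
    return "world";
  }
  default String greet() {
    return "Hello, " + this.name() + "!";
  }
}

class English implements Greeting {
}

class GreetingDemo {
  public static void main(String[] args) {
    System.out.println(new English().greet()); // Hello, world!
  }
}
```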

Also, let's take a look at what was done to some core classes in the JDK, to understand where the language is heading. Just two examples.

Factory methods for collections (JEP 269). Instead of introducing new constructors and allowing us to do this:

List<Integer> list = new ArrayList<>(1, 2, 3);

...in Java 9 they created more static factory methods and made us do this:

List<Integer> list = List.of(1, 2, 3);

"Fewer constructors, more static methods!" seems to be the philosophy of those who introduced this JEP. Needless to say, this is completely against the very spirit of object-oriented programming. Objects must be created by constructors, not static methods, no matter what Joshua Bloch says. Static methods make the moment of operator new usage invisible to us, and that's why the code is far less maintainable---we simply don't know exactly which class is instantiated and what the real arguments of its constructor are.

By the way, with Cactoos you can do it the right way:

List<Integer> list = new ListOf<>(1, 2, 3);

This is OOP.

New methods in InputStream. Three new methods were added to the already bloated class InputStream: transferTo(), readNBytes(), and readAllBytes(). Now, when we want an input stream to be copied to an output stream, we are supposed to do this:

input.transferTo(output);

It's one of the most typical mistakes young OOP programmers make: they make their interfaces big, just because they need more functionality. Meanwhile, the interface segregation principle, part of the famous SOLID, is many years old. What's wrong with you, Oracle? What will the next step be? In Java 10 will we also have saveToFile() and printToConsole()? How about emailToAFriend()?

This is how you would do the same with the IOUtils utility class from commons-io:

IOUtils.copy(input, output);

It's not perfect, but it's better. The most object-oriented way is to use objects, not utility classes and static methods. This is how it works in Cactoos:

new LengthOf(new TeeInput(input, output)).length();

This is OOP.


In my opinion, Java is getting uglier, and this is a trend. Does it mean that it's time to quit? No! No matter how ugly you are, we will always love you, Java!

© Yegor Bugayenko 2014–2018

ThreeCopies.com---Server-Side Data Backup Service

  • Odessa, Ukraine

I have a number of data resources which exist in one place only and which I don't really want to lose. For example, I have a hosted PHP website, and a MySQL database hosted at the same place. I also have a NoSQL database at Amazon DynamoDB, a PostgreSQL database at Heroku, and also... Well, there are many of them. How to back them up was always a question for me.


The most straightforward way is to rent a cheap $15/mo server (or use an existing one) and configure Cron to run a custom bash script, which will pull the data from the MySQL database, package it, and upload it to some place where it will be safe, such as an Amazon S3 bucket. Then, I would need another script for the PostgreSQL database, another one for the FTP file archive, etc.

This is actually how I was doing it for years. The drawbacks of this solution were always the same:

  • I needed to pay for the server.
  • I needed to make sure the server was always up and running (Linux is far from reliable).
  • I needed to back up my scripts too.
  • I needed to SSH to the server every time I wanted to change a script, remember where they were, how they start, etc.

The biggest issue is that every single owner of a data source faces exactly the same set of problems. "Why can't I create a hosted solution for these scripts, to help everybody to back up their data," I was asking myself for years. "Well, I can," was the answer just a few weeks ago, and I created ThreeCopies.

It's a very simple hosted executor of bash scripts, which you edit through a web interface. Then one of our servers starts a Docker container (yegor256/threecopies is the image, here is the Dockerfile) and runs your script inside.

The script is started every hour, every day, and every week. Hence the name: "three copies." It's good practice in data backup to create separate copies at different regularities. Also, you might want to put different data into different copies. To help your script understand which copy is running at any particular time, we pass the $period environment variable into it, with the value of either hour, day, or week.

How your script pulls the data, packages it, and archives it depends on the data. I created a short cheat sheet for the most common scenarios. This is how I back up the MySQL database for thePMP, for example:

# I don't want to back up every hour
if [ "${period}" == "hour" ]; then exit 0; fi

# I dump the entire database into the file
mysqldump --lock-tables=false --host=db.thepmp.com \
  --user=thepmp --password=********* \
  --databases thepmp > thepmp.sql

# I compress the file
tgz="$(date "+%Y-%m-%d-%H-%M").tgz"
tar czf "${tgz}" thepmp.sql

# I upload it to Amazon S3 bucket
echo "[default]" > ~/.s3cfg
echo "access_key=AKIAICJKH*****CVLAFA" >> ~/.s3cfg
echo "secret_key=yQv3g3ao654Ns**********H1xQSfZlTkseA0haG" >> ~/.s3cfg
s3cmd --no-progress put "${tgz}" "s3://backup.yegor256.com/${tgz}"

The output of the script is available through the web interface and this is yet another benefit of this solution. It's easy to monitor what went wrong and restart the script. All logs are available through the browser. No SSH, no terminals.

I would say that it's a light version of AWS Data Pipeline. ThreeCopies does exactly the same, but it's easier to configure, and it's cheaper. I'm charging $0.01 per script execution hour. And I actually charge per second, while AWS always charges for a full hour. For $5.00 you get 500 hours. For example, the script you see above takes about 5 minutes to complete (the database is not huge). If you skip the hourly executions, like I did above, you will consume 170 minutes of server time every month, which will cost you about $0.34 per year! This is much cheaper than a server and its monitoring, I believe.

One more thing before you go. ThreeCopies is written in Java 8 and is open source; find it on GitHub. Feel free to inspect the code, find bugs, and contribute with fixes or improvements.


What Motivates Me as a Programmer

  • Odessa, Ukraine

I wrote a number of sarcastic articles about management and motivation, where some traditional and very popular practices were criticized. Now I've decided to think it all over and summarize what actually motivates me as a programmer when I'm working for someone else. Let's say you hire me tomorrow as a Java coder and ask "What do you want us to do for you so that you will be most productive?" This would be my wish list.

300 (2006) by Zack Snyder

The list is in no particular order.

Remote work. I like to be in the office, but I hate it when I have to be there from 9 till 5. It's very important for me to have the ability to work from wherever I want. Most companies declare that, but in reality I will have to "inform" you every time I decide to stay home. Instead, I want to inform you when I decide to visit the office. In other words, my default state should be "not in the office."

Isolation of tasks. I hate being responsible for someone else's mistakes, and I'm not really a good team player. I want to solve problems on my own and be responsible for my own successes and failures. That's why clearly defined and isolated tasks motivate me and help me stay focused and interested. I want to see them in writing (as tickets) and I want to know exactly what the definition of done is. Simply put, what should I do in order for a task to be considered completed?

Responsibility borders. I hate to be afraid, especially if I don't really know what I'm supposed to be afraid of. I want to know what my possible punishments are and when they will occur. I need to know the rules of the game. Say I commit a bug into the code and we lose $100,000. What will happen to me? Or say I don't finish a task by the deadline. Or I don't answer an email. Or I miss a bug during a code review. Or I break the master branch. What are the consequences? Their clear explanation will seriously boost my motivation.

Open source. I'm a big fan of open source. If you are not, I most probably won't like working for you. If your company makes some code open and I am part of that process, that will seriously affect my motivation, because I will achieve two goals at the same time: make money and become more popular in the open source world. Working in purely closed software projects is a demotivating factor for me.

Project visibility. I'd love to see my name close to a project that is visible to the world. And it doesn't necessarily have to be Google or Facebook. Actually, in those companies regular programmers are way less visible than in smaller startups. So, unless you make me VP of Engineering, I won't consider a position in a big company interesting in that respect. The most interesting project would be a small startup with an ambitious goal and high exposure in the media. Being there even as a regular programmer will motivate me a lot.

Clear hierarchy. Yes, I've heard about holacracy, flat and self-managing teams, and other modern ideas. I hate them all. I believe that any management is based on power and force, and the best way to avoid the negative aspects of these rather violent concepts is to organize and structure them. Without a clear and well-defined hierarchy of roles, a team very quickly turns into a snake pit, with politics, backstabbing, and behind-the-scenes games. So, if you can't tell me exactly who my boss is and what the chain of command in the group is, I simply won't consider this place seriously and won't be motivated.

No Agile/Scrum, please. Do I need to say anything else here?

Payment structure. I hate to guess about money, I prefer to know the numbers and the logic behind them. I want to know exactly how much I'm going to get and when. I want to know when the numbers will go up and how I can affect that. Also, I'd like to know the payment policy of the company and, ideally, salaries or rates of the people around me. Jealousy, which arises with the surprising information that someone is getting more than I do, doesn't motivate at all, even if my pay is decent. It would be much easier for me if I knew everything from the first day.

Business transparency. I hate working for big ideas, if they are not mine. Mostly because I know that almost all of them fail. Working for a failure and being told that our future is bright doesn't really motivate me, at all. That's why I would expect you to tell me honestly why a meeting with investors took three hours instead of one and why the door was so tightly closed. Also, I would want to know why our CTO quit a few weeks ago and now works for our competitors. I'd like to know our honest situation in the market and why the web traffic stats are going down. In other words, I'm either a slave kept in the dark, or I know the truth and I'm motivated (no matter how ugly the truth is).

Payments per results. I haven't seen this anywhere, except with my own projects, but I believe it's how good teams should be structured: everybody must be paid for results, not per hour/week/month/year. If you want me to be truly motivated, you have to invent a payment structure where my paychecks correspond to my results. I do realize that this may require you to change the entire management system, so I don't absolutely insist. But you have to remember that as long as you pay me only for my time, I will do my best to steal that time from you and use it for my own benefit.

Career path. I have no problem starting as a junior developer, but I have to know exactly what my future is and when it will happen. I want to become a CTO, no matter what. And it's not about the title. It's about the amount of technical authority and responsibility I will have. I want it all. If I don't see a clear path to achieve that, I will be very demotivated and will treat my job as temporary. I will always be looking for a better place, where it's easier to become a CTO. So it's your job to make that career growth obvious for me. If it will never be possible for me to become the CTO, make that obvious too. The truth is better anyway.

A strong boss. This is probably the most important requirement I would have. I can't work under a weak manager, it will seriously demotivate me from the first day. I will probably write another blog post about what a "strong manager" is, but in a nutshell it's someone who is ready to fight for his or her own ideas, rights, thoughts, decisions, etc. A weak manager is one who is swimming with the current. Working under such a manager is a huge frustration and a waste of time. I will be demotivated and no amount of money will keep me interested.

These things don't matter at all, I won't even ask about them:

  • Mission and vision of the company
  • Business domain
  • Tech stack
  • Location
  • Company size or structure
  • Race, gender, sexual orientation, religious or political beliefs of people in the team
  • Financial status of the company

Of course, I don't think this list applies to everybody. Other programmers may have other points or may disagree with mine.

P.S. I would most probably stay away from a business involved in something I consider unethical, like corruption, gambling, crime, etc.


Yet Another Evil Suffix For Object Names: Client

  • Odessa, Ukraine

Some time ago we were talking about "-ER" suffixes in object and class names. We agreed that they were evil and must be avoided if we want our code to be truly object-oriented and our objects to be objects instead of collections of procedures. Now I'm ready to introduce a new evil suffix: Client.

Sin noticias de Dios (2001) by Agustín Díaz Yanes

Let me give an example first. This is what an object with such a suffix may look like (it's a pseudo-code version of the AmazonS3Client from AWS Java SDK):

class AmazonS3Client {
  createBucket(String name);
  deleteBucket(String name);
  doesBucketExist(String name);
  getBucketAcl(String name);
  getBucketPolicy(String name);
  listBuckets();
  // 160+ more methods here
}

All "clients" look similar: they encapsulate the destination URL with some access credentials and expose a number of methods, which transport the data to/from the "server." Even though this design looks like a proper object, it doesn't really follow the true spirit of object-orientation. That's why it's not as maintainable as it should be, for two reasons:

  • Its scope is too broad. Since the client is an abstraction of a server, it inevitably has to represent the server's entire functionality. When the functionality is rather limited, there is no issue. Take HttpClient from Apache HttpComponents as an example. However, when the server is more complex, the size of the client also grows. There are over 160 (!) methods in AmazonS3Client at the time of writing, while it started with only a few dozen just a few years (and a few hundred versions) ago.

  • It is data focused. The very idea of a client-server relationship is about transferring data. Take the HTTP RESTful API of the AWS S3 service as an example. There are entities on the AWS side: buckets, objects, versions, access control policies, etc., and the server turns them into JSON/XML data. Then the data comes to us and the client on our side deals with JSON or XML. It inevitably remains data for us and never really becomes buckets, objects, or versions.

The consequences depend on the situation, but these are the most probable:

  • Procedural code. Since the client returns the data, the code that works with that data will most likely be procedural. Look at the results AWS SDK methods return, they all look like objects, but in reality they are just data structures: S3Object, ObjectMetadata, BucketPolicy, PutObjectResult, etc. They are all Data Transfer Objects with only getters and setters inside.

  • Duplicated code. If we actually decide to stay object-oriented, we will have to turn the data the client returns into objects. Most likely this will lead to code duplication across multiple projects. I experienced that too when I started working with the S3 SDK. Very soon I realized that, in order to avoid duplication, I'd better create a library that does the job of converting S3 SDK data into objects: jcabi-s3.

  • Difficulties with testing. Since the client is in most cases a rather big class/interface, mocking it in unit tests or creating its test doubles/fakes is a rather complex task.

  • Static problems. Client classes, even though their methods are not static, look very similar to utility classes, which are well known for being anti-OOP. The issues we have with utility classes are almost the same as those we have with "client" classes.

  • Extendability issues. Needless to say, it's almost impossible to decorate a client object when it has 160+ methods and keeps on growing. The only possible way to add new functionality to it is to create new methods. Eventually we get a monster class that can't be reused without modification.

What is the alternative?

The right design would be to replace "clients" with client-side objects that represent the entities of the server side, not the entire server. For example, with the S3 SDK, those could be Bucket, Object, Version, Policy, etc. Each of them would expose the functionality of the real buckets, objects, and versions on the AWS side.

Of course, we will need a high-level object that somehow represents the entire API/server, but it should be small. For example, in the S3 SDK example it could be called Region, which means the entire AWS region with buckets. Then we could retrieve a bucket from it and won't need a region anymore. Then, to list objects in the bucket we ask the bucket to do it for us. No need to communicate with the entire "server object" every time, even though technically such a communication happens, of course.
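Here is a sketch of that design in Java. The interfaces are hypothetical, loosely inspired by jcabi-s3; all the names are mine, not the real API:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// One small entry-point object represents the server...
interface Region {
  Bucket bucket(String name);
}

// ...and each server-side entity gets its own small object.
interface Bucket {
  String name();
  Iterable<String> objects();
}

// A fake in-memory Region, to show how easy it is to create
// a test double for a two-method interface.
final class FakeRegion implements Region {
  private final Map<String, Iterable<String>> buckets = new HashMap<>();
  FakeRegion with(String bucket, String... objects) {
    this.buckets.put(bucket, Arrays.asList(objects));
    return this;
  }
  @Override
  public Bucket bucket(final String name) {
    return new Bucket() {
      @Override
      public String name() {
        return name;
      }
      @Override
      public Iterable<String> objects() {
        return buckets.getOrDefault(name, Collections.emptyList());
      }
    };
  }
}
```

Compare mocking this two-method Bucket in a unit test with mocking a client of 160+ methods.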

To summarize, the trouble is not exactly in the name suffix, but in the very idea of representing the entire server on the client side rather than its entities. Such an abstraction is 1) too big and 2) very data driven.

BTW, check out some of the JCabi libraries (Java) for examples of object-oriented clients without "client" objects: jcabi-github, jcabi-dynamo, jcabi-s3, or jcabi-simpledb.


ReHTTP.net---HTTP Repeater

  • Odessa, Ukraine

I faced a problem a few weeks ago with 0pdd.com, one of my web apps that is supposed to receive HTTP requests (known as webhooks) from GitHub: sometimes the app is down, GitHub gets an HTTP error, and never sends the request again. The request simply gets lost. There is absolutely no way to receive it again once the app is back up. I realized that I needed a service mesh between GitHub and my web app, to accept HTTP requests and repeat them later if they can't be delivered immediately.

I created rehttp.net to do exactly that.

See, the URL I've been giving to GitHub is this one:

http://www.0pdd.com/hook/github

From now on a new URL has to be used:

https://www.rehttp.net/p/http://www.0pdd.com/hook/github

It looks very similar, but starts with https://www.rehttp.net/p/. GitHub sends all webhook PUT/POST requests to the ReHTTP server, which stores them in a temporary database (I'm using AWS DynamoDB).

ReHTTP attempts to deliver them immediately. If something goes wrong and the server HTTP response code is not in the 200-299 range, the next attempt is made in about an hour. Then it retries every hour for about a day. If all attempts fail, it abandons the request and that's it.
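The schedule is simple enough to sketch in code. This is my own illustration of the policy described above, not the actual ReHTTP source:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// An illustration of the retry policy: one immediate attempt,
// then a retry every hour, for about a day (24 attempts total).
final class RetryPolicy {
  private static final int MAX_ATTEMPTS = 24;

  // Planned attempt times, in milliseconds after the request is received.
  static List<Long> schedule() {
    final List<Long> plan = new ArrayList<>();
    for (int attempt = 0; attempt < MAX_ATTEMPTS; ++attempt) {
      plan.add(TimeUnit.HOURS.toMillis(attempt));
    }
    return plan;
  }

  // Delivery counts as successful only for response codes in 200-299.
  static boolean delivered(int status) {
    return status >= 200 && status < 300;
  }
}
```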

What is interesting is that now I can see a summary of my API here. I see how many requests were processed successfully over the last 24 hours and how many of them failed. Also, I have this cute button:

badge

And I have a URL for checking the status of the entire API:

https://www.rehttp.net/s?u=http%3A%2F%2Fwww.0pdd.com%2Fhook%2Fgithub

I gave this URL to StatusCake to ping it every five minutes. If and when something goes wrong, StatusCake will email me and drop me a message on the phone.

ReHTTP is absolutely free. It is written in Java and the code is open. See its GitHub repository and contribute if you find any bugs or just want to add a feature.


XCOP --- XML Style Checker

  • Odessa, Ukraine

One of the biggest advantages of XML over many other data formats is that it is human-readable. Well, to some extent, you may say. Indeed, a badly formatted XML document may be rather difficult to digest. I'm not talking about XML validity now, but about formatting style. Just like we format our Java/Ruby/Python code nicely and then check its "prettiness" with static analyzers, we can also check our XML documents. Six years ago I asked the StackOverflow community for such a tool, but unfortunately my question was down-voted and closed (you will need 10K+ reputation to see it). Last week I finally decided to create the tool myself, and I called it xcop.

L'appartement (1996) by Gilles Mimouni

It's a very simple command line Ruby gem. First, you install it:

$ gem install xcop

And then you ask it to check your XML file, say pom.xml:

$ xcop pom.xml

If the file is not "pretty," xcop will complain and show what's wrong. You can ask xcop to fix the file:

$ xcop --fix pom.xml

Moreover, in most cases you may need your XML files to include a license in their headers, especially if it's open source. To enforce that, just point xcop to the file with the license:

$ xcop --license=LICENSE.txt pom.xml

I believe it's good practice to use xcop together with Checkstyle (for Java files), Rubocop (for Ruby files), and other static analyzers, to ensure that your XML files always look pretty.

Read how you can integrate xcop with Rake, Maven, and other builders. I will appreciate it if you contribute your own integrations too.


To Be Nice or Not to Be Nice?

  • Odessa, Ukraine

I stumbled upon this two-year-old article Why It's Safe for Founders to Be Nice, written by Paul Graham (a co-founder of Y Combinator), whom I honestly respect, and I decided to explain why I disagree. Not that I think we shouldn't be nice. Not at all. But I do think that "being nice" is not a solution for organizational, management, marketing, sales, or business development problems. Moreover, in most cases it is actually not safe for founders to be nice.

Scarecrow (1973) by Jerry Schatzberg

Graham in his article quotes a founder who explains his worries about being "fundamentally soft-hearted and tending to give away too much for free." Then he suggests the founder should not worry too much, because "as long as you build something good enough to spread by word of mouth, you'll have a hyperlinear growth curve." In other words, don't worry about your softness and instead focus on building a great product---your efforts will be appreciated. But will this really work in the modern world?

It will, provided you're a talented mathematician, or a composer, or maybe a writer, where your success doesn't really depend on people close to you, like employees, partners, and investors. However, developing a business is a different story, where success mostly depends on your ability to generate profit, which, by definition, is "taking more and returning less." What kind of a soft heart will be happy to do that?

7 Soft Hearted Mistakes Startup Founders Make perfectly summarizes how softness may become a weakness if we fail to keep it under control. In a nutshell, we are either soft-hearted or successful. This "weakness" affects more people every year, since the entire world, especially its male part, tends to soften up, mostly thanks to the rapidly growing quality of life.

For some of us, who, like that founder, are "fundamentally soft-hearted," doing business and generating profit is a very stressful activity. We have to do what goes against our inner self and take advantage of others. Telling us that it is perfectly safe to "be nice" in this situation is not ethical---this disarms us and makes us vulnerable in front of those who don't have the "disease of soft-heartedness." Is there a better recipe out there to heal our illness?

Even though I'm trying to think of myself as not a weak person, I have plenty of soft-heartedness disease symptoms. For example, I feel guilty when:

  • I fire an employee
  • I negotiate someone down
  • I punish someone
  • I don't pick up the phone
  • I say "No" to an offer
  • I don't trust people
  • I break up with a girl
  • I don't return my mom's calls

Any successful business person, including Paul Graham, who deals with hundreds of startups every year, would tell you that in order to achieve something you have to take many steps that will make many people around you unhappy. You have to fire people, say "No" to them, punish them, never return their phone calls, and rarely trust anyone. But we're "fundamentally soft-hearted" and simply can't do that every day---it's too stressful for us. However, we also want to be successful in business! We don't want to be mathematicians, or composers, or just Java programmers. We want to move up in business. What do we do?

Let me share the recipe I found for myself.

Obviously, we develop soft-heartedness when we grow up, mostly under the influence of our parents. As kids we quickly learn that in order to survive and have something to eat we must make those grown-ups happy, or at least not disappoint them. Later on we call this guilt-driven behavior "soft-heartedness" and become proud of it. But I believe it's unfixable. Those who were traumatized by guilt in their childhood will never be able to offend somebody and walk away without any negative feelings. They are scarred forever.

The only possible way to get rid of guilt is to replace it with a greater guilt. For example, you just bought two ice cream cups and a friend asks you to give him one. You can't say "No" because you would feel guilty for making him unhappy. But you remember that your mom told you to buy two cups and bring both of them home. You would feel even more guilt if you made your mom unhappy. So you say "No" to your friend and run home. Of course, you also can't eat the ice cream yourself, even though you want it---you are afraid to make your mother unhappy.

The same principle may be applied to business. But instead of having a controlling parent, you can define your own "rules of doing business." Those rules will be stronger than any particular situation you're facing at any particular moment. For example, you can decide when and why you answer your emails and phone calls, what should happen for an employee to be discharged, how exactly you punish your employees, how your relationships with partners are structured, etc. This document, or set of documents, will be more important to you than any particular person or situation. You will feel much more guilt for disobeying the rules than for making someone unhappy.

At least this is what works for me. Call it self-discipline or a systematic approach to doing business, but in reality it's just a countermeasure against guilt.

To summarize, and to answer the question whether it is safe for a business person to be nice, I would say that it is very unsafe. But not being nice is obviously not a solution either, because anyone asking this question clearly wants to be nice. Thus, the only solution I managed to find is a personal code of conduct, which helps me be effective and not stressed at the same time.


Bitcoin Is Not a Pyramid. Coinbase Is.

  • Odessa, Ukraine

In September 2016 I paid Coinbase $1,222 for two BTCs, $611 each. Seven months later, in April 2017, they paid me back $2,490, which was $1,245 for each BTC. My profit before tax was $1,268, over 100% of the investment, in just seven months. Moreover, if I had waited until today, I would have made $6,800 profit instead. Actually, I still have a few BTCs in my Coinbase account and I can make that 750% profit, if I sell now. Should I? The BTC price is over $4,000. Will it go up? Or down? What would you do?

Two and a Half Men (2011) by Chuck Lorre

Where did that profit come from? How can Coinbase pay me eight times more now than I paid them less than a year ago? Where did they get that cash?


Obviously you know what a Ponzi scheme is, right? If you don't, watch Episode 16 of the 8th Season of Two and a Half Men. In a nutshell, I'm getting my investment and my profit from other people, who are paying Coinbase for my BTCs. If there were no demand, my entire investment would disappear and I would only have the BTC bits and bytes in my virtual wallet. I would lose my $1,222.

Some recent articles call Bitcoin and other altcoins, like Ethereum or Litecoin, Ponzi schemes; for example, The Rise of Cryptocurrency Ponzi Schemes in The Atlantic and Bitcoin passes $1,000 but only number that matters is zero in the Financial Times.

Indeed, the key attribute of a pyramid scheme is right there: the product being sold has no value aside from the demand-generated one, just like Twitter stocks (it's a joke). Should we blame Bitcoin for that? I don't think so.

I think that what Coinbase (and similar traders like CoinMama or CEX.io) are doing is definitely a pyramid and must be stopped by the government, sooner rather than later.

Imagine, I start a company tomorrow. I call it YegorBase. It will buy and sell YegorCoins, which I will also create. The price will be $100 a piece. Some of you may buy that, especially because it will be possible to sell them the next day, when the price will be $101. Initially I will create some fuss to promote the product and the trading will begin. I will make my commission.

Will this be legal? I seriously doubt that Stripe, for example, would approve my account if I told them my business plan.

And the problem is not the YegorCoin itself---there is nothing wrong in creating your own crypto-electronic-whatever-product. The problem is that I'll be trading something that has a very questionable value, being completely unregulated by the government. The trading of my YegorCoin, just like shares of stocks, bonds, options, gold, and Bitcoin, must be regulated by the people we're paying our taxes to---the police.

You may say that Coinbase is not issuing those Bitcoins and only trading them---and that's why it's not a pyramid. Not true. In my YegorBase shop I won't need to issue my YegorCoins for very long. As soon as the total volume of YegorCoins is big enough, the situation will control itself and the price will jump and fall somehow. I will make profit on the trading commission and the owners of YegorCoin will hope that they can manage to sell everything before my shop gets busted.

Why the police still close their eyes to what Coinbase is doing, I simply can't understand. Maybe because NYSE and A16Z are on the long list of Coinbase's investors?


RAII in Java

  • Riga, Latvia

Resource Acquisition Is Initialization (RAII) is a design idea introduced in C++ by Bjarne Stroustrup for exception-safe resource management. Thanks to garbage collection, Java doesn't have this feature, but we can implement something similar using try-with-resources.

At Sachem Farm (1998) by John Huddles

The problem RAII is solving is obvious; have a look at this code (I'm sure you know what Semaphore is and how it works in Java):

class Foo {
  private Semaphore sem = new Semaphore(5);
  void print(int x) throws Exception {
    this.sem.acquire();
    if (x > 1000) {
      throw new Exception("Too large!");
    }
    System.out.printf("x = %d", x);
    this.sem.release();
  }
}

The code is rather primitive and doesn't do anything useful, but you most probably get the idea: the method print(), if called from multiple parallel threads, will allow only five of them to print in parallel. And if x is bigger than 1000, it will throw an exception instead of printing.

The problem with this code is---resource leakage. Each print() call with x larger than 1000 takes one permit from the semaphore and never returns it. After five such failing calls the semaphore will be empty and no other thread will be able to print anything.
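
To see the leak in action, here is a small self-contained sketch. The Foo class is the same as above; the permits() helper and the LeakDemo class are my own additions, for illustration only:

```java
import java.util.concurrent.Semaphore;

class Foo {
  private final Semaphore sem = new Semaphore(5);
  void print(int x) throws Exception {
    this.sem.acquire();
    if (x > 1000) {
      throw new Exception("Too large!");
    }
    System.out.printf("x = %d", x);
    this.sem.release();
  }
  // helper added for illustration only
  int permits() {
    return this.sem.availablePermits();
  }
}

public class LeakDemo {
  public static void main(String[] args) {
    Foo foo = new Foo();
    for (int i = 0; i < 5; ++i) {
      try {
        foo.print(2000); // each failing call leaks one permit
      } catch (Exception ex) {
        // ignore; the permit is already lost
      }
    }
    // all five permits are gone; any further print() would block forever
    System.out.println("permits left: " + foo.permits()); // prints "permits left: 0"
  }
}
```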

What is the solution? Here it is:

class Foo {
  private Semaphore sem = new Semaphore(5);
  void print(int x) throws Exception {
    this.sem.acquire();
    if (x > 1000) {
      this.sem.release();
      throw new Exception("Too large!");
    }
    System.out.printf("x = %d", x);
    this.sem.release();
  }
}

We must release the permit before we throw the exception.

However, there is another problem that shows up: code duplication. We release the permit in two places. If we add more throw instructions we will also have to add more sem.release() calls.

A very elegant solution was introduced in C++ and is called RAII. This is how it would look in Java:

class Permit {
  private Semaphore sem;
  Permit(Semaphore s) throws InterruptedException {
    this.sem = s;
    this.sem.acquire();
  }
  @Override
  public void finalize() {
    this.sem.release();
  }
}
class Foo {
  private Semaphore sem = new Semaphore(5);
  void print(int x) throws Exception {
    new Permit(this.sem);
    if (x > 1000) {
      throw new Exception("Too large!");
    }
    System.out.printf("x = %d", x);
  }
}

See how beautiful the code is inside method Foo.print(). We just create an instance of class Permit and it immediately acquires a new permit at the semaphore. Then we exit the method print(), either by exception or in the normal way, and the method Permit.finalize() releases the permit.

Elegant, isn't it? Yes, it is, but it won't work in Java.

It won't work because, unlike C++, Java doesn't destroy objects when their scope of visibility is closed. The object of class Permit won't be destroyed when we exit the method print(). It will be destroyed eventually, but we don't know when exactly. Most likely it will be destroyed long after all the permits in the semaphore have been acquired and all our threads are blocked.

There is a solution in Java too. It is not as elegant as the one from C++, but it does work. Here it is:

class Permit implements Closeable {
  private Semaphore sem;
  Permit(Semaphore s) {
    this.sem = s;
  }
  @Override
  public void close() {
    this.sem.release();
  }
  public Permit acquire() throws InterruptedException {
    this.sem.acquire();
    return this;
  }
}
class Foo {
  private Semaphore sem = new Semaphore(5);
  void print(int x) throws Exception {
    try (Permit p = new Permit(this.sem).acquire()) {
      if (x > 1000) {
        throw new Exception("Too large!");
      }
      System.out.printf("x = %d", x);
    }
  }
}

Pay attention to the try block and to the Closeable interface that the class Permit now implements. The object p will be "closed" when the try block exits. It may exit either at the end, or by the return or throw statements. In either case Permit.close() will be called: this is how try-with-resources works in Java.

I introduced method acquire() and moved sem.acquire() out of the Permit constructor because I believe that constructors must be code-free.

To summarize, RAII is a perfect approach when you deal with resources that may leak. Even though Java doesn't have it out of the box, we can implement it via try-with-resources and Closeable.
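
The same trick works for any leak-prone resource, not just semaphores. Here is a hedged sketch applying it to a ReentrantLock; the Guard class and all names below are mine, not from any library:

```java
import java.io.Closeable;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// A Closeable wrapper around a Lock, in the spirit of Permit above
class Guard implements Closeable {
  private final Lock lock;
  Guard(Lock lck) {
    this.lock = lck;
  }
  Guard acquire() {
    this.lock.lock();
    return this;
  }
  @Override
  public void close() {
    this.lock.unlock();
  }
}

public class GuardDemo {
  private static final Lock LOCK = new ReentrantLock();
  private static int counter;
  static void increment() {
    try (Guard g = new Guard(LOCK).acquire()) {
      ++counter; // the lock is released even if this block throws
    }
  }
  static int value() {
    return counter;
  }
  public static void main(String[] args) {
    increment();
    increment();
    System.out.println("counter = " + counter); // prints "counter = 2"
  }
}
```

No matter how the try block exits, close() runs and the lock is released, exactly as with the semaphore permit above.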


How to Manage a Manager?


  • Odessa, Ukraine

No secret, you all have managers. Some of them are great, while many are simply idiots. What do you do if you happen to have a boss that fits perfectly into this dominating category? Quit and try to find a better place? This may sound like good advice, but you know as well as I do that a new boss most likely won't be any better. Don't quit. Stay. Manage the manager. Most of them are manageable.

The Intouchables (2011) by Olivier Nakache

First of all, remember your goal: do nothing and get paid. It's hardly achievable to its full extent, but you can get very close. Doing something useful two hours a week and collecting a paycheck for forty hours is what a professional engineer must aim for. The other 38 hours you will spend on your own projects, your open source ideas, your education, your dreams.

The biggest problem on the road to this success is the manager, who is hired exactly to prevent this from happening. Managers use multiple instruments to catch you and force you to give them your time. Here is what I did in these situations against a really annoying manager I once had:

Tasks. Believe it or not he would assign some coding tasks to me. I would do them very slowly or not do them at all. With a serious shortage of programmers on the market and my relatively good profile he wasn't able to fire me. So he had to put up with the fact that I simply didn't write any code, no matter how many tasks were assigned. Very soon he gave up this idea and stopped giving me anything. I basically created an image of a very skilled engineer who didn't write code. No matter how much you asked.

Meetings. At the beginning he was calling me to all possible meetings, because he thought that I was very smart. I was even smarter than he thought: in each meeting I expressed my opinions in a very aggressive and conflict-provoking way. And I always had enough opinions to express. Very soon he stopped calling me to those hours-long meetings because I was simply ruining them, making strong points, and never "being nice." Then, when he stopped calling, I pretended to be offended, as if I really wanted to contribute and yet they were all ignoring me. Guilt is a very powerful management instrument, you know.

Reports. From time to time he was interested to know what was going on, mostly by email or in Slack chat. I always had a very long list of things I was "working on," which were absolutely cryptic to him. He was not a programmer and didn't have enough courage to verify my claims. Any time he asked what I was busy with, I sent him something like "HDFS reconfig for Docker image" or "Integration tests for JAX-RS endpoints." He was happy to see that I was very busy and he left me alone for another week or two. Actually, I would recommend you send such reports to your managers pro-actively, before they even ask. This will make them feel even more comfortable.

Morning Stand-ups. These are annoying and very dangerous, because other programmers may catch your lies about "HDFS and Docker." The best defense is offense: I always acted very interested in what other people were working on. I always asked additional questions too, making them afraid of me. It worked. They never bothered me with their suspicions. Try not to avoid the stand-up meetings---if a manager sees you there they assume that you're actually working.

Advice. He would ask me for technical advice, to help him make his decisions. This is rather risky, because eventually you have to be responsible for the advice you give, especially if you are a lead developer or an architect. The best way to avoid this risk is to transfer responsibility to someone else. I was always trying to ask somebody in the team to help me: to analyze the problem and create a short email/report with pros and cons. Junior programmers are usually very interested in doing such a favor for someone more senior. Then, I just forwarded that email to the manager. Very soon he stopped coming to me for the analysis and asked the junior guys directly.

Emails. Long email threads are very annoying especially in big teams. I never read them. You should not read them either, if you value your time. However, you can't just ignore them because everybody will feel that you are either lazy or a sociopath. Neither of which is in your favor. What I always did was pick up on any message from the thread and reply to it, with a question. It's called trolling. You provoke others to keep the conversation going, even though you're not interested in it at all. A few emails like that a day and people will think that you're on top of everything in the team.

Coaching. My manager would ask me every now and then to train new programmers and to help them out. This too was very risky, since the new guys usually decided that I was their friend and would talk to me about everything, consuming my time. To prevent this from happening I would always try to introduce them to somebody else---their new friend. Everybody, if they don't understand the consequences, is happy to talk to junior programmers and to patronize them. I just had to forward those juniors immediately to the right person.

Personal meetings. This was the most annoying part of all: face-to-face meetings with the manager. He asked me how happy I was in the team, what my plans were, what problems I saw, etc. I was not able to say "Well, I'm happy that you guys are still paying me and my biggest problem is that you annoy me far too often." Instead I had to invent plans, ideas, problems, and things that I wasn't happy about. I always kept a list of such things ready, in case a manager ever called me for a meeting.

That was my strategy. How do you manage your managers?


My Favorite Websites


  • Odessa, Ukraine

I recently published a summary of the software and hardware I'm using every day. Now I'll list my favorite websites and online services, which help me do my daily job: write code and manage projects.

Gmail is the best email system. It aggregates all my accounts in one place: all emails are coming to the @gmail.com address and I have a few aliases for sending responses. Google Calendar helps me keep my plans in sync, and Google Drive helps me share documents with friends sometimes.

GitHub is where we keep our source code repositories. I was also using Bitbucket a few years ago, but didn't really like it. GitHub is simply the best, no doubt.

StackOverflow is where I learn new technologies, ask questions, answer them sometimes and really enjoy working with the community. When I have some free time I just search by "design-patterns" or "oop" tags in order to find relevant questions and answer them if I can.

Wring aggregates my GitHub notifications and helps me stay on top of all events that are happening there. It's my own project and I use it every day.

Rultor is helping me merge pull requests, deploy my projects to production and release their new versions. It's one of my pet projects and I'm using it every day.

Heroku is where I host almost all my Java and Ruby projects.

Travis for open source and Shippable for commercial projects, as continuous integration platforms. There are many alternatives, but I like these guys.

GoDaddy is where I register all my domain names. Their user interface is terrible, but their prices are the lowest. So, I've been with them for at least eight years already.

Google Analytics and Google Webmasters are the two systems where I track the web traffic of all my websites. Both are terrible, performance- and interface-wise. But since it's Google, we have to stay with them, to see how they understand web statistics. I'm also using hit.ua.

Hacker News and /r/programming at Reddit seem to be the best sources of news for me. I open them a few times a week.

Facebook and Twitter are the social networks I'm spamming regularly. They are far from perfect, but they totally dominate the market. I was also using LinkedIn and Google+, but they seem to be less popular now.

Amazon is where I buy everything that is not food or clothes.

AWS is where I keep all my data, my DNS, my servers, and some other things. I've been with them since they launched S3 (in 2006).

Contabo is where I host the servers I need 24x7. Their prices are great and the service is pretty decent. I've been with them for at least five years.

Papertrail is where all my logs are aggregated. I tried a few other systems, but these guys seem to be the best. I was even paying them for some time, but stopped---their free package seems to be sufficient for my volumes.

StatusCake is a website monitoring system that keeps an eye on all my web sites and web apps and notifies me by email and through Pushover ($5).

Sentry is where all my errors are being sent to, from production systems.

Buffer is helping me to spam Twitter and Facebook more effectively. I pay them for the premium package and have no regrets. The service really helps me deliver content to the social networks on schedule.

YouTube is where I keep my video content. I also stream my webinars there.

SoundCloud is where I listen to music and keep my own channel with the Shift-M podcast and other audio content.

QuickBooks is where I keep all my bookkeeping accounts, for all my companies.

Did I forget anything?


The Bigger Victim of Sexual Harassment


  • Saint Petersburg, Russia

You most probably are aware of the recent sexual harassment scandals in Silicon Valley, which led to serious career problems for Dave McClure (former CEO of 500 Startups), Travis Kalanick (former CEO of Uber), Chris Sacca, and a few others. Let's try to put emotions aside and analyze what's happening and what long-term consequences this panic may have for our male-dominated engineering environment.

Twentynine Palms (2003) by Bruno Dumont

We will never know what really happened between those male investors and female entrepreneurs behind closed doors, but the gist of the accusation is that the former were making advances towards the latter, where it was clearly inappropriate.

What does that mean exactly? What, for example, did Dave McClure do in order to be kicked out of the company he founded?

Let's see. According to Katie Benner from The New York Times, he sent this message to Sarah Kunst, who was an applicant at his incubator in 2014 (other "harassers" were accused of doing something very similar):

I was getting confused figuring out whether to hire you or hit on you.

Obviously he was flirting. And what's wrong with that? Can't we flirt any more? Is it a crime all of a sudden? One thing is wrong though. It was a threat.

Even though this message didn't say it explicitly, it actually implied that "you either go along and we go out, or you won't be hired." Of course, "no hire" may not hurt as much as a knife or a fist in the face, but it does hurt. Especially if you are a woman, a member of a minority group.

I'm a gentleman: I feel bad when I see women being threatened. I want to protect them, and I expect our system of justice to help me, just like it helps when we deal with physical violence. These threats, or attempts to threaten, must be punished.

However!

There is nothing wrong with the game between the sexes by itself. Just like we don't prohibit sex because we are afraid of rapists, we must not stop asking women out because of our fear of a possible harassment accusation. However, it seems that this is what is happening now: men are becoming afraid of women. Especially powerful men with money and status. They realize that a small incautious step may help them become the next Dave or Travis.

This fear is destroying us.

Thanks to this fear, our masculine characteristics become a victim of this dangerous game that society is playing right now, with the noble intent of protecting women against verbal and psychological abuse. We men are losing our self-confidence, our self-esteem, our courage.

This fear mentally castrates us.

We must be absolutely clear that those harassers are guilty not because they "made advances" towards women who work with them or for them, but because they were threatening those ladies. There is nothing wrong in being attracted to a woman, no matter who she is: your boss, your employee, your schoolmate, or the founder of the business you're investing in. Moreover, there is nothing wrong in making advances, falling in love, having sex and marrying them. Threatening them in order to get all that is very wrong.

Not every flirtation is harassment. Not every man with a sexual instinct is a rapist.

How would I re-phrase that message to Sarah if I had been Dave and had fallen in love with her at first sight? I would probably say:

I was getting confused, figuring out how to isolate my hiring process from my desire to hit on you. How about dinner? I promise, no matter how it ends, it will not affect our hiring decision at all.

I realize that it's not perfect, since she wouldn't be completely comfortable, and still might expect Dave's feelings to affect his hiring decision if it turns out she doesn't like him or simply already has a partner. However, this would seem like a respectful attempt to put the weapon down.

It is difficult to not threaten a woman if you're holding a gun in your hand while asking her out. Money and power is the gun.

Guys, let's be very careful. However, let's not forget how to seduce women and fall in love with them. We have to stay men, especially when some of those among us are weak abusers deserving no respect.


How I Would Re-design equals()


  • Copenhagen, Denmark

I want to rant a bit about Java design, in particular about the methods Object.equals() and Comparable.compareTo(). I've hated them for years, because, no matter how hard I try to like them, the code inside looks ugly. Now I know what exactly is wrong and how I would design this "object-to-object comparing" mechanism better.

L'ultimo capodanno (1998) by Marco Risi

Say we have a simple primitive class Weight, objects of which represent the weight of something in kilos:

class Weight {
  private int kilos;
  Weight(int k) {
    this.kilos = k;
  }
}

Next, we want two objects of the same weight to be equal to each other:

new Weight(15).equals(new Weight(15));

Here is how such a method may look:

class Weight {
  private int kilos;
  Weight(int k) {
    this.kilos = k;
  }
  public boolean equals(Object obj) {
    if (!(obj instanceof Weight)) {
      return false;
    }
    Weight weight = Weight.class.cast(obj);
    return weight.kilos == this.kilos;
  }
}

The ugly part here is, first of all, the type casting with instanceof. The second problem is that we touch the internals of the incoming object. This design makes polymorphic behavior of the Weight impossible. We simply can't pass anything else to the equals() method, besides an instance of the class Weight. We can't turn it into an interface and introduce multiple implementations of it:

interface Weight {
  boolean equals(Object obj);
}

This code will not work:

class DefaultWeight implements Weight {
  // attribute and ctor skipped
  public boolean equals(Object obj) {
    if (!(obj instanceof Weight)) {
      return false;
    }
    Weight weight = Weight.class.cast(obj);
    return weight.kilos == this.kilos; // error here!
  }
}

The problem is that one object decides for the other whether they are equal. This inevitably leads to a necessity to touch private attributes in order to do the actual comparison.

What is the solution?

This is what I'm offering. Any comparison, no matter what types we're talking about, boils down to comparing two sequences of digits. Whether we compare a weight with a weight, text with text, or a user with a user---our CPUs can only compare numbers. Thus, we introduce a new interface Digitizable:

interface Digitizable {
  byte[] digits();
}

Next, we introduce a new class Comparison, which is the comparison of two streams of bytes (I'm not sure the code is perfect, I tested it here, feel free to improve and contribute with a pull request):

class Comparison<T extends Digitizable> {
  private T lt;
  private T rt;
  Comparison(T left, T right) {
    this.lt = left;
    this.rt = right;
  }
  int value() {
    final byte[] left = this.lt.digits();
    final byte[] right = this.rt.digits();
    int result = 0;
    int max = Math.max(left.length, right.length);
    // go from the most significant byte to the least significant one;
    // the shorter array is padded with leading zeros
    for (int idx = max; idx > 0; --idx) {
      int lft = 0;
      if (idx <= left.length) {
        lft = left[left.length - idx] & 0xff; // compare bytes as unsigned
      }
      int rht = 0;
      if (idx <= right.length) {
        rht = right[right.length - idx] & 0xff;
      }
      result = lft - rht;
      if (result != 0) {
        break;
      }
    }
    return (int) Math.signum(result);
  }
}

Now, we need Weight to implement Digitizable:

class Weight implements Digitizable {
  private int kilos;
  Weight(int k) {
    this.kilos = k;
  }
  @Override
  public byte[] digits() {
    return ByteBuffer.allocate(4)
      .putInt(this.kilos).array();
  }
}

Finally, this is how we compare them:

int v = new Comparison<Weight>(
  new Weight(400), new Weight(500)
).value();

This v will either be -1, 0, or 1. In this particular case it will be -1, because 400 is less than 500.

No more violation of encapsulation, no more type casting, no more ugly code inside those equals() and compareTo() methods. The class Comparison will work with all possible types. All our objects need to do in order to become comparable is to implement Digitizable and "provide" their bytes for inspection/comparison.
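
To illustrate the claim that Comparison works with all possible types, here is a sketch with a second implementation of Digitizable. The Text class is my own invention for this example; note that its right-aligned byte padding gives a numeric-style rather than strictly lexicographic ordering for strings of different lengths:

```java
import java.nio.charset.StandardCharsets;

interface Digitizable {
  byte[] digits();
}

// a compact variant of the article's Comparison class
class Comparison<T extends Digitizable> {
  private final T lt;
  private final T rt;
  Comparison(T left, T right) {
    this.lt = left;
    this.rt = right;
  }
  int value() {
    byte[] left = this.lt.digits();
    byte[] right = this.rt.digits();
    int max = Math.max(left.length, right.length);
    // most significant byte first; missing leading bytes count as zero
    for (int idx = max; idx > 0; --idx) {
      int lft = idx <= left.length ? left[left.length - idx] & 0xff : 0;
      int rht = idx <= right.length ? right[right.length - idx] & 0xff : 0;
      if (lft != rht) {
        return lft < rht ? -1 : 1;
      }
    }
    return 0;
  }
}

// a second Digitizable type: its "digits" are its UTF-8 bytes
class Text implements Digitizable {
  private final String value;
  Text(String v) {
    this.value = v;
  }
  @Override
  public byte[] digits() {
    return this.value.getBytes(StandardCharsets.UTF_8);
  }
}

public class TextDemo {
  public static void main(String[] args) {
    System.out.println(
      new Comparison<Text>(new Text("apple"), new Text("banana")).value()
    ); // prints "-1"
  }
}
```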

This approach is actually very close to the printers I described earlier.


Am I a Sexist?


  • Odessa, Ukraine

Recently I said a few words in my Telegram group about "women in tech," which led to some negative reaction on Twitter. I believe I owe my readers an explanation. Some of them already got confused and came to me with the question: "If you're so much against slavery, where is this male chauvinism coming from?" Let me explain what's going on. Indeed I am a big fan of freedom, but recent hysteria around gender equality is not helping us to become more free. Instead it is causing quite the opposite effect.

Conversations with Other Women (2005) by Hans Canosa

When I was a kid my parents and my teachers told me that I had to be a gentleman. That literally meant that I had to treat women with respect and always remember that they were weaker than us men---physically and emotionally.

I had to open the door for them, I was not allowed to fight with them as I did with my male friends, I was punished for cursing in front of them, and many other things. I did all this not only because of what I was taught, but also because I saw that they indeed were weaker. They were physically and emotionally different from us boys. They played with dolls, we played war. They wore pink and white skirts and blouses, we wore shorts and t-shirts. They cried when someone offended them, while we were supposed to be stronger, never to cry, and to protect them. It was always obvious that we were the troublemakers, but also the protectors, whom those little creatures would eventually marry.

Now it seems that I was raised as a sexist. But do I feel ashamed? Not at all.

Do I believe that gender equality is about making men and women fully equal in all aspects of life? Not at all.

We must not be equal, because we are different.

There are many other categories of people in modern societies who possess special rights or are not allowed to do certain things. Simply put, they experience discrimination, but for their own good. For example, kids are not allowed to buy alcohol or watch porn. Senior citizens receive pensions even though they don't work any more. People with disabilities have reserved seats on a bus, where everybody else is not allowed to sit. Such inequalities only make us more human. We need them.

The same is true for the "category" of people I was taught to respect and protect: women. And no matter what extreme feminists say, I will remain the same. I do believe in inequality between men and women. I want to see us as gentlemen and ladies. Not just genderless people.

Thanks to this inequality men find women attractive, fall in love with them, marry them and make kids. Thanks to this inequality many years ago my dad asked my mom out, kissed her, proposed, and brought me into this world.

Now back to the main problem: women in tech. I'm a programmer myself. I write and debug code every day. I also manage programmers and projects. My 20+ years of experience in coding tells me that this job is not fun most of the time. It's hard, it requires a lot of rigid logical thinking, it's rather boring, and it's a constant war against machines and against other programmers who produce unmaintainable and unreadable code.

I don't feel good about sending women, who I was raised to protect and respect, into this war. I also personally don't like the idea of women being police officers, soldiers, surgeons, or firefighters, even though it's not up to me to decide what they do for a living. Those jobs are stressful and dangerous, both physically and emotionally. Not that I believe that women can't take this stress, I just don't want them to suffer. There are plenty of gentlemen who can do that instead.

Do I respect women who write code on a daily basis? Yes, a lot. Because I understand how much stress they have to go through. Would I recommend my girlfriend do the same? I don't think so.

Am I a sexist? Maybe so. But that's how I was raised. That's who I am. And I'm proud of it.


My Work Environment


  • Odessa, Ukraine

I was asked in my Telegram Group which tools and hardware I use in my daily work. Here is the full list of what I have and even how much I paid for them. Maybe it will be helpful for someone.

MacBook Pro Retina, 15-inch, Late 2013, 2.3GHz/16GB/512GB ($2,900) with macOS Sierra. I bought it over three years ago and don't want to replace it with a new one, simply because, rumor has it, the quality of the new models is very low. My smartphone is an iPhone 6s. For video and podcast recording I'm using a Zoom H6 together with a Movo LV4-O2 microphone, Sennheiser HD 380 PRO headphones, and a SLIK Sprint 150 tripod.

IntelliJ IDEA Ultimate ($499, but free for me) for Java projects. I've got a free open source license from JetBrains, because I'm an active contributor and the author of the Takes Framework. If you contribute to open source (and you must), you may be able to do the same; just email them and ask. These are my settings.jar.

Sublime Text ($70) for all the texts I edit, including Ruby, JavaScript, and PHP code. Also, for writing this blog in Markdown and my books in LaTeX, Sublime is the editor I use. I was using TextMate ($52) a few years ago, but switched to Sublime because it's just better. I also tried Atom (free) and didn't really like it.

Chrome (free) for web browsing. I also have Safari, FireFox, and Opera but only to verify my websites for cross-browser compatibility, that's all. I'm using these plugins: AdBlock, Block site, Grammarly, Rapportive.

iTerm2 (free) for the command line. It's a replacement for Terminal, with some nice features, which I'm not actually aware of.

Homebrew (free) for package management. It's difficult to imagine Mac OSX without Homebrew. I was using MacPorts on my previous MacBook, but switched to Homebrew and have no regrets.

YourKit ($499, but free for me) for Java profiling. I've got a free license from them, again because I'm a contributor to a few open source projects. Email them if you want to get the same. As I tweeted recently, when I have to use YourKit, I know that I'm doing something very wrong in my code.

HTTP Client ($2) for HTTP requests/responses debugging, when curl is not enough.

Sketch ($99) for editing vector graphics, mainly SVG. Also, it's good for converting SVG into PNG.

Pixelmator ($30) for editing rasterized images, like PNG, GIF, JPEG, etc. The icons you see on this page were created with the help of Pixelmator. It is a perfect alternative to Adobe Photoshop, if you are a programmer, not a graphic designer.

1Password ($40) for keeping all my passwords in one place. I don't know what I would do without this tool. All my passwords, bank accounts, credit cards, and passport scans are there.

Zoom ($150/year) for conference calling and Shift-M.

Reaper ($60) for post-processing of Shift-M episodes.

Things ($50) for long-term planning. I put my long-term plans there and open them once a month. I definitely should use this software more frequently.

Tower ($79) for visual Git manipulations. Even though I use Git from the command line only, this tool helps me from time to time when I need to go through the history and find out where exactly I broke something.

Transmit ($34) for FTP and AWS S3 file management.

OBS Studio (free) for webinars and video recording.

µTorrent (free) for stealing movies. I find them mostly at rutracker.org or The Pirate Bay. I do realize that stealing is a bad thing and I'm actually strongly against piracy, but most movies are either too expensive or not available for purchase.

VLC (free) for watching those stolen movies.

Vienna (free) for reading RSS feeds, rarely.

iStat Menus ($18) for OSX monitoring.

Keynote for presentations, Pages for documents sometimes, and Numbers for spreadsheets. I don't remember paying for them, but maybe I just forgot.

Dropbox (free), iCloud ($1/mo), and Google Drive for storage. I try to keep as little as possible on my laptop and upload everything that is already "history" to my personal AWS S3 bucket.

Telegram, Viber, WhatsApp, Messenger, and Skype (all free, in order of preference) for P2P messaging. Slack for business messaging sometimes. Colloquy for IRC messaging when I need it.

PokerStars (not free at all) for playing poker before falling asleep.


Object-Oriented Declarative Input/Output in Cactoos


  • Dnipro, Ukraine

Cactoos is a library of object-oriented Java primitives we started to work on just a few weeks ago. The intent was to propose a clean and more declarative alternative to JDK, Guava, Apache Commons, and others. Instead of calling static procedures we want to use objects, the way they are supposed to be used. Let's see how input/output works in a pure object-oriented fashion.

Disclaimer: The version I'm using at the time of writing is 0.9. Later versions may have different names of classes and a totally different design.

Let's say you want to read a file. This is how you would do it with the static method readAllBytes() from the utility class Files in JDK7:

byte[] content = Files.readAllBytes(
  new File("/tmp/photo.jpg").toPath()
);

This code is very imperative---it reads the file content right here and now, placing it into the array.

This is how you do it with Cactoos:

Bytes source = new InputAsBytes(
  new FileAsInput(
    new File("/tmp/photo.jpg")
  )
);

Pay attention---there are no method calls yet. Just three constructors of three classes that compose a bigger object. The object source is of type Bytes and represents the content of the file. To get that content out of it we call its method asBytes():

byte[] content = source.asBytes();

This is the moment when the file system is touched. This approach, as you can see, is absolutely declarative and thanks to that possesses all the benefits of object orientation.
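
The mechanics behind this laziness are easy to sketch. Here is a minimal hand-rolled version of the same idea; the interfaces and the class below are my own simplified stand-ins, not the real Cactoos types:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// the role Bytes plays: a value you can ask for, later
interface Bytes {
  byte[] asBytes() throws IOException;
}

// the role Input plays: a source of a stream
interface Input {
  InputStream stream() throws IOException;
}

// a decorator that turns an Input into Bytes, lazily:
// nothing is read until asBytes() is called
class InputAsBytes implements Bytes {
  private final Input source;
  InputAsBytes(Input src) {
    this.source = src;
  }
  @Override
  public byte[] asBytes() throws IOException {
    try (InputStream in = this.source.stream()) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[1024];
      int len;
      while ((len = in.read(buf)) >= 0) {
        out.write(buf, 0, len);
      }
      return out.toByteArray();
    }
  }
}

public class LazyDemo {
  public static void main(String[] args) throws IOException {
    // composing objects: no I/O happens on this line
    Bytes bytes = new InputAsBytes(
      () -> new ByteArrayInputStream("Hello!".getBytes())
    );
    // I/O happens only here
    System.out.println(bytes.asBytes().length); // prints "6"
  }
}
```

The constructor only stores the source; all reading is postponed until the moment somebody actually asks for the bytes, which is exactly what the Cactoos composition above does.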

Here is another example. Say you want to write some text into a file. Here is how you do it in Cactoos. First you need the Input:

Input input = new BytesAsInput(
  new TextAsBytes(
    new StringAsText(
      "Hello, world!"
    )
  )
);

Then you need the Output:

Output output = new FileAsOutput(
  new File("/tmp/hello.txt")
);

Now, we want to copy the input to the output. There is no "copy" operation in pure OOP. Moreover, there must be no operations at all. Just objects. We have a class named TeeInput, which is an Input that copies everything you read from it to the Output, similar to what TeeInputStream from Apache Commons does, but encapsulated. So we don't copy, we create an Input that will copy if you touch it:

Input tee = new TeeInput(input, output);

Now, we have to "touch" it. And we have to touch every single byte of it, in order to make sure they all are copied. If we just read() the first byte, only one byte will be copied to the file. The best way to touch them all is to calculate the size of the tee object, going byte by byte. We have an object for it, called LengthOfInput. It encapsulates an Input and behaves like its length in bytes:

Scalar<Long> length = new LengthOfInput(tee);

Then we take the value out of it and the file writing operation takes place:

long len = length.value();

Thus, the entire operation of writing the string to the file will look like this:

new LengthOfInput(
  new TeeInput(
    new BytesAsInput(
      new TextAsBytes(
        new StringAsText(
          "Hello, world!"
        )
      )
    ),
    new FileAsOutput(
      new File("/tmp/hello.txt")
    )
  )
).value(); // happens here

This is its procedural alternative from JDK7:

Files.write(
  new File("/tmp/hello.txt").toPath(),
  "Hello, world!".getBytes()
);

"Why is object-oriented better, even though it's longer?" I hear you ask. Because it perfectly decouples concepts, while the procedural one keeps them together.

Let's say, you are designing a class that is supposed to encrypt some text and save it to a file. Here is how you would design it the procedural way (not a real encryption, of course):

class Encoder {
  private final File target;
  Encoder(final File file) {
    this.target = file;
  }
  void encode(String text) {
    Files.write(
      this.target,
      text.replaceAll("[a-z]", "*")
    );
  }
}

Works fine, but what will happen when you decide to extend it to also write to an OutputStream? How will you modify this class? How ugly will it look after that? That's because the design is not object-oriented.

This is how you would do the same design, in an object-oriented way, with Cactoos:

class Encoder {
  private final Output target;
  Encoder(final File file) {
    this(new FileAsOutput(file));
  }
  Encoder(final Output output) {
    this.target = output;
  }
  void encode(String text) {
    new LengthOfInput(
      new TeeInput(
        new BytesAsInput(
          new TextAsBytes(
            new StringAsText(
              text.replaceAll("[a-z]", "*")
            )
          )
        ),
        this.target
      )
    ).value();
  }
}

What do we do with this design if we want OutputStream to be accepted? We just add one secondary constructor:

class Encoder {
  Encoder(final OutputStream stream) {
    this(new OutputStreamAsOutput(stream));
  }
}

Done. That's how easy and elegant it is.

That's because concepts are perfectly separated and functionality is encapsulated. In the procedural example the behavior of the object is located outside of it, in the method encode(). The file itself doesn't know how to write, some outside procedure Files.write() knows that instead.

To the contrary, in the object-oriented design the FileAsOutput knows how to write, and nobody else does. The file writing functionality is encapsulated and this makes it possible to decorate the objects in any possible way, creating reusable and replaceable composite objects.
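Because everything that writes implements the same `Output` contract, any behavior can be layered on through decoration. Here is a minimal sketch; `Output` is simplified and `LoggedOutput` is a hypothetical decorator, not a class from Cactoos:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

// A simplified Output contract, in the spirit of Cactoos (illustrative).
interface Output {
  OutputStream stream();
}

// A hypothetical decorator: behaves like the wrapped Output,
// but announces every time its stream is requested.
final class LoggedOutput implements Output {
  private final Output origin;
  LoggedOutput(Output origin) {
    this.origin = origin;
  }
  @Override
  public OutputStream stream() {
    System.out.println("stream requested");
    return this.origin.stream();
  }
}

public class Decorating {
  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream memory = new ByteArrayOutputStream();
    // Wrap an in-memory Output; the Encoder would not notice the difference.
    Output output = new LoggedOutput(() -> memory);
    output.stream().write("Hello".getBytes());
    System.out.println(memory.toString());
  }
}
```

The `Encoder` above would accept `new LoggedOutput(new FileAsOutput(file))` without a single change to its own code; that is the payoff of keeping the writing behavior inside the object.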

Do you see the beauty of OOP now?


DynamoDB + Rake + Maven + Rack::Test

  • Odessa, Ukraine

In SixNines.io, one of my Ruby pet web apps, I'm using DynamoDB, a NoSQL cloud database by AWS. It works like a charm, but the problem is that it's not so easy to create an integration test, to make sure my code works together with the "real" DynamoDB server and tables. Let me show you how it was solved. The code is open source and you can see it in the yegor256/sixnines GitHub repo.

How to bootstrap DynamoDB Local

First, you need to use DynamoDB Local, a command line tool created by AWS exactly for the purposes of testing. You need to start it before your integration tests and stop it afterwards.

To make things simpler I suggest you use jcabi-dynamodb-maven-plugin, a Maven plugin that I made a few years ago. You will need to add pom.xml to your repository and start/stop Maven from a Rakefile, just like I'm doing here:

task :dynamo do
  FileUtils.rm_rf('dynamodb-local/target')
  pid = Process.spawn('mvn', 'install', chdir: 'dynamodb-local')
  at_exit do
    `kill -TERM #{pid}`
  end
  begin
    Dynamo.new.aws.describe_table(table_name: 'sn-endpoints')
  rescue Exception => e
    sleep(5)
    retry
  end
end

First, I'm removing dynamodb-local/target, the directory where Maven keeps its temporary files, to make sure we always start from scratch.

Then, I'm starting mvn install, using Process.spawn, as a background process with pid as its process ID (this won't work in Windows, only Linux/Mac). Then I immediately register an at_exit Ruby hook, which will be executed if Ruby dies for any reason. I'm sure it's obvious why I have to do that---in order to avoid garbage running in the background after Rake is finished or terminated.

Pay attention, I'm using kill -TERM instead of kill -KILL, in order to give Maven a chance to wrap everything up, terminate DynamoDB Local correctly, close its TCP port and exit.

How to check that it's running

Next I'm checking the status of sn-endpoints, one of the tables in the DynamoDB Local. It has to be there if the server is up and running. It will be created by jcabi-dynamodb-maven-plugin according to sn-endpoints.json, its JSON configuration.

Most probably the table won't be ready immediately though, since it takes time to bootstrap Maven, start the server, and create tables there. That's why, if the exception is thrown, I catch it, wait for 5 seconds and try again. I keep trying many times, until the server is ready. Eventually it will be. It takes about 12-15 seconds on my MacBook, which means 2-3 attempts/exceptions.

How to connect to DynamoDB Local

My classes need to know how to connect to the server during integration tests. In production they need to connect to AWS, in testing they have to know about the DynamoDB Local instance I just started. This is what I have in my class Dynamo, which is responsible for the very connection with DynamoDB. Its decision on where to connect is based on the environment variable RACK_ENV, which is set to "test" in test__helper.rb, which is included by rake/testtask in front of all other tests, thanks to the double underscore in its name.

If the environment variable is set to "test", Dynamo takes the connectivity parameters from the YAML file dynamodb-local/target/dynamo.yml created by maven-resources-plugin:copy-resources. The TCP port of the DynamoDB Local database will be there, as well as the DynamoDB authentication key and secret.

How to run the integration tests

This is the easiest part. I just use my objects the way they are supposed to be used in production and they connect to DynamoDB Local instead of AWS.

I'm using Rack::Test in order to test the entire application, via a set of HTTP calls. For example, here I'm trying to render Jeff's user account page. Its HTTP response code is supposed to be 200:

require 'test/unit'
require 'rack/test'
class AppTest < Test::Unit::TestCase
  include Rack::Test::Methods
  def app
    Sinatra::Application
  end
  def test_user_account
    header('Cookie', 'sixnines=jeff')
    get('/a')
    assert_equal(200, last_response.status)
  end
end

Now you can run the entire test from the command line. You can see how Rultor runs it while releasing a new version: full log. Also, see how it works in Travis. In a nutshell:

  • You call rake in the command line and Rake starts;
  • Rake attempts to run the default task, which depends on test;
  • Rake attempts to run test, which depends on dynamo;
  • Rake, inside test task, runs mvn install in the background with this pom.xml;
  • Maven unpacks DynamoDB Local installation package;
  • Maven reserves a random TCP port and stores its value into ${dynamo.port};
  • Maven saves ${dynamo.port} and key/secret pair info;
  • Maven starts DynamoDB Local, binding it to the reserved TCP port;
  • Rake waits for DynamoDB Local availability on the reserved port;
  • Rake imports all test classes starting from test__helper.rb;
  • Environment variable RACK_ENV is set to "test";
  • Rack::Test attempts to dispatch a web page;
  • Dynamo loads YAML config from dynamo.yml and connects to DynamoDB Local;
  • Rake stops;
  • Ruby terminates Maven and it stops DynamoDB Local.

That's it.


Gluten-Free Management Recipes

  • Moscow, Russia

We live in the era of organic food, eco-friendly toilets, zero-emission cars, and harassment-free offices. Our management practices have to keep up---they must be zero-stress, conflict-free, and idiot-friendly. If you're still stuck in the old carrot-and-stick, mediocrity-intolerant, primitive mentality, these recipes will open your eyes.

Dogville (2003) by Lars von Trier

Be Positive. You must remember that keeping people happy is more important than their results, effectiveness and productivity. As an organic manager you must never offend your team with any negativity, finger-pointing, or honesty. You must think positively and always find a balance somewhere between "you guys are good" and "you're just fantastic." You must be their full-hearted cheerleader! Truth hurts and modern management is not about hurting people, it's about making them happy. You must be a permanent source of good news, funny stories, and smiles, shielding your people from the reality and its threats.

Be Politically Correct. How many latinx are in your team? What about gays and transsexuals? You're not sure? That's wrong. You have to count. You don't know how to segregate bi-sexuals from "normal" people and black females from white males? Well, do the research, there must be some "mischling tests" you could do. You have to learn how to count and pay attention to the numbers. Uber and others already do. Diversity, equality, and tolerance: these metrics are quickly replacing outdated performance and productivity appraisals.

Be Reward-Focused. Trotsky said in 1937 that people would eventually work not out of greed or poverty, but because they just enjoy it. He would be happy to see that his prophesy finally came true: programmers don't work for money any more. They work for challenge, respect, curiosity, recognition, cooperation, and free beer on Fridays. As a modern eco-manager you must stop humiliating them with old-fashioned dollar bills and start inspiring them with these intrinsic rewards and free cookies. You will save some money too.

Be Equal. Holacracy is quickly replacing military-style management hierarchies, which proved to be depressing and hurtful to human feelings. Modern managers don't even like to be called "managers." Instead, they demand we call them ambassadors, curators, cheerleaders, and catalysts. You must be very creative in hiding your actual career ladder position, in order not to irritate and hurt your team. Of course, you will still use fear and guilt to keep your slaves under control, but new eco-titles will help you keep them less aware of what's going on.

Be a Leader. Traditional project management with its critical path method, contingency reserves, cause-and-effect diagrams, work breakdown structures, policies, procedures, and plans, is dead. Agile, Scrum, Kanban, and others, are dancing on its grave, with motivational speeches, burndown rates, sprints, standups, and smoothies. The world doesn't need managers anymore, it needs leaders. And you must be one. PMBOK and RUP on your bookshelf must be replaced with Elon Musk and Peter Thiel biographies.

Be a Committer. Not a Git committer, obviously---that's for junior programmers. You have to commit yourself to the project, just like Tim Cook committed to MacBooks. Technically, this means: staying in the office till late, taking no vacations, wearing a branded t-shirt, never discussing a salary or any money at all, and being a proud patriot, no matter how ugly the product is. In other words, being a good slave and showing your team a good example. Don't worry about the future, when projects fail, good committers just commit themselves to new projects.

Be a Friend. Explicit and non-ambiguous work orders are not popular any more---they create tension and destroy team morale. Guilt-driven management is much more subtle and elegant: you turn your subordinates into a family and then say: "I count on you as a friend." Who can refuse a friend? And not even a friend, but a mom. This is who they will associate you with when the guilt kicks in---with their mothers. They won't disobey, won't quit, and will never ask for more money.

Be Responsible. In the brick-and-mortar management era responsibility was measured by what you earned if you succeeded and how you suffered if you failed. Luckily, the world has recently learned that this was a mistake we'd been making for hundreds of years. We don't suffer any more. Being responsible now means something completely different. We wear stickers on our MacBooks, we fall in love with "extremely talented people," we participate, engage, celebrate, and champion excellence. And, of course, now it is always "We" instead of "I": "We are responsible," "It is our project," "We can," "Our goal is to do it together!" Of course, when the project fails you find a new one, with a bigger salary and a new group of extremely talented people, and become very responsible again. Nice, huh?


Why Do You Contribute to Open Source?

  • Kiev, Ukraine

You probably remember my half-a-year-old article: Why Don't You Contribute to Open Source?. I said there that if you don't have your own OSS projects or don't give anything back to those you're using---something is wrong with you. Now I'm talking to those who actually do contribute without demanding anything back---guys, you're doing it wrong!

The Untouchables (1987) by Brian De Palma

Open source almost always means free, as in beer---nobody will pay you anything for your pull requests. However, it doesn't mean you're not allowed to make money.

Of course, you will not earn any cash directly by fixing a bug in docker/docker, but you may earn some intangible value, which will be converted to cash later.

Moreover, you must earn it---this is my point. Otherwise, if you contribute out of pure altruism, you will lose the motivation very soon. Not because you're greedy and only money can motivate you, but because without money you won't feel your efforts are being truly appreciated.

The intangible value I'm talking about is, of course, your resume. If you are an active contributor to docker/docker, your resume says so, and you're in the list of contributors---your hourly rate or your salary will definitely be higher.

I would recommend, before becoming a contributor, you ask yourself a question: How will this affect my resume, my profile, and my reputation on the market? Will they put me into the list of contributors? Will they promote me in exchange for my pull requests? Will I be able to mention their name in my next job interview?

Most projects won't do anything for you unless you explicitly ask. I'm a maintainer of over 40 GitHub repositories and at least six of them have over 200 stars (yegor256/tacit, yegor256/takes, yegor256/rultor, yegor256/eo, jcabi/jcabi-aspects, teamed/qulice). If you submit a pull request to any of them, I will just review it, merge and forget your name. I won't do anything else, simply because you didn't ask.

However, if you ask me to put your name and a link to your blog in the list of contributors, I will do it without any hesitation. Moreover, if you do that right inside the pull request you are submitting, I will merge it and your name will go right into the repo, into the very place you find the most suitable.

Over the last six years of my active participation in open source development I've never seen anyone asking me that. Why? I don't know.


Any Program Has an Unlimited Number of Bugs

  • Odessa, Ukraine

This may sound strange, but I will prove it: no matter how big or stable a piece of software is, it has an unlimited number of bugs not yet found. No matter how many of them we have already managed to find and fix, there are still too many left to count.

L'amico di famiglia (2006) by Paolo Sorrentino

Let's take this simple Java method that calculates a sum of two integers as an example:

int sum(int a, int b) {
  return a + b;
}

This simple program has an unlimited number of bugs.

To prove this claim we just need to put two thoughts together:

  • First, a bug is something that compromises the quality of software, which, according to IEEE 610.12-1990, is "the degree to which a system meets specified requirements or user expectations."

  • Second, requirements and expectations may be functional and non-functional. The latter include performance, resilience, robustness, maintainability, and a few dozen other NFRs.

It is obvious that there are at least two variables in this equation that are ambiguous: user expectations and maintainability. We can't be precise about them and that's why the number of bugs they will produce has no limit.

Of course, only a very limited subset of the entire set of bugs has any real business impact. Most of the bugs that exist in a program may stay there even after it is shipped to its users---either nobody will ever find them, or the damage they cause to the user experience will be insignificant.

Finally, take a look at the method sum() one more time. How about these bugs:

  • It doesn't handle overflows
  • It doesn't have any user documentation
  • Its design is not object-oriented
  • It doesn't sum three or more numbers
  • It doesn't sum double numbers
  • It doesn't cast long to int automatically
  • It doesn't skip execution if one argument is zero
  • It doesn't cache results of previous calculations
  • There is no logging
  • Checkstyle would complain since arguments are not final

I'm sure you can find many more.
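The first bug on the list is the easiest to demonstrate: Java's `int` addition silently wraps around on overflow. The demo below shows the wrap, and also how the standard library's `Math.addExact()` would turn it into an explicit error:

```java
public class OverflowDemo {
  static int sum(int a, int b) {
    return a + b;
  }
  public static void main(String[] args) {
    // The mathematically correct answer is 2147483648,
    // but int wraps around into the negative range.
    System.out.println(sum(Integer.MAX_VALUE, 1)); // prints -2147483648
    // Math.addExact() turns the silent wrap into an ArithmeticException.
    try {
      Math.addExact(Integer.MAX_VALUE, 1);
    } catch (ArithmeticException ex) {
      System.out.println("overflow detected");
    }
  }
}
```

Whether wrapping counts as a bug depends entirely on the unstated expectations of the caller, which is exactly the point of the argument above.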

BTW, Glenford J. Myers said something very similar in his book "The Art of Software Testing," which I reviewed earlier.

Bill Hetzel, The Complete Guide to Software Testing (1993): "Some Theoretical Limits to Testing: 'We can never be sure the specifications are correct,' 'No testing system can identify every correct program,' 'We can never be certain that a testing system is correct.' These theoretical limits tell us that there will never be a way to be sure we have a perfect understanding of what a program is supposed to do (the expected or required results) and that any testing system we might construct will always have some possibility of failing. In short, we cannot achieve 100 percent confidence no matter how much time and energy we put into it!"


Single Statement Unit Tests

  • Odessa, Ukraine

Many articles and books have already been written about unit testing patterns and anti-patterns. I want to add one more recommendation which, I believe, can help us make our tests, and our production code, more object-oriented. Here it is: a test method must contain nothing but a single assert.

Bullet (1996) by Julien Temple

Look at this test method from RandomStreamTest from OpenJDK 8, created by Brian Goetz:

@Test
public void testIntStream() {
  final long seed = System.currentTimeMillis();
  final Random r1 = new Random(seed);
  final int[] a = new int[SIZE];
  for (int i=0; i < SIZE; i++) {
    a[i] = r1.nextInt();
  }
  final Random r2 = new Random(seed);
  final int[] b = r2.ints().limit(SIZE).toArray();
  assertEquals(a, b);
}

There are two parts in this method: the algorithm and the assertion. The algorithm prepares two arrays of integers and the assertion compares them and throws AssertionError if they are not equal.

I'm saying that the first part, the algorithm, is the one we should try to avoid. The only thing we must have is the assertion. Here is how I would re-design this test method:

@Test
public void testIntStream() {
  final long seed = System.currentTimeMillis();
  assertEquals(
    new ArrayFromRandom(
      new Random(seed)
    ).toArray(SIZE),
    new Random(seed).ints().limit(SIZE).toArray()
  );
}
private static class ArrayFromRandom {
  private final Random random;
  ArrayFromRandom(Random r) {
    this.random = r;
  }
  int[] toArray(int s) {
    final int[] a = new int[s];
    for (int i=0; i < s; i++) {
      a[i] = this.random.nextInt();
    }
    return a;
  }
}

If Java had monikers this code would look even more elegant:

@Test
public void testIntStream() {
  assertEquals(
    new ArrayFromRandom(
      new Random(System.currentTimeMillis() as seed)
    ).toArray(SIZE),
    new Random(seed).ints().limit(SIZE).toArray()
  );
}

As you can see, there is only one "statement" in this method: assertEquals().

Hamcrest with its assertThat() and its collection of basic matchers is a perfect instrument to make our single-statement test methods even more cohesive and readable.
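The idea behind such matchers, turning the comparison itself into a reusable object, can be sketched without the library. The names below are illustrative, not the real Hamcrest API:

```java
// A matcher is an object that encapsulates one expectation (illustrative,
// not the real Hamcrest interface).
interface Matcher<T> {
  boolean matches(T actual);
}

public class Matchers {
  // The single-statement entry point: one assertion, one object.
  static <T> void assertThat(T actual, Matcher<T> matcher) {
    if (!matcher.matches(actual)) {
      throw new AssertionError("mismatch: " + actual);
    }
  }
  // A reusable expectation, composable with others.
  static Matcher<Integer> even() {
    return actual -> actual % 2 == 0;
  }
  public static void main(String[] args) {
    // The test method body stays a single statement.
    assertThat(42, even());
    System.out.println("passed");
  }
}
```

Each matcher is itself a small declarative object, so a library of them grows the same way the `ArrayFromRandom` helper above does.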

There are a number of practical benefits of this principle, if we agree to follow it:

  • Reusability. The classes we will have to create for test assertions will be reusable in other test methods and test cases. Just as, in the example above, ArrayFromRandom could be used somewhere else. Similarly, Hamcrest matchers may and will constitute a library of reusable test components.

  • Brevity. Since it will be rather difficult to create a long test method when it only has a single assert, you and your fellow programmers will inevitably write shorter and more readable code.

  • Readability. With a single assert it will always be obvious what the intent of the test method is. It will start with the intent declaration while all other lower level details will be indented.

  • Immutability. It will be almost impossible to have setters in the production code if test methods have no place for algorithmic code. You inevitably will create immutable objects to make them testable with a single assert.

The biggest benefit we get when this principle is applied to our tests is that they become declarative and object-oriented, instead of being algorithmic, imperative, and procedural.


Monikers Instead of Variables

  • Odessa, Ukraine

If we agree that all local variables must be final, multiple returns must be avoided, and temporal coupling between statements is evil---we can get rid of variables entirely and replace them with inline values and their monikers.

OSS 117: Cairo, Nest of Spies (2006) by Michel Hazanavicius

Here is the code from Section 5.10 (Algorithms) of my book Elegant Objects:

public class Main {
  public static void main(String... args) {
    final Secret secret = new Secret();
    new Farewell(
      new Attempts(
        new VerboseDiff(
          new Diff(
            secret,
            new Guess()
          )
        ), 5
      ),
      secret
    ).say();
  }
}

Pay attention to the variable secret. It exists here because we need its value twice: first, as a constructor argument for the Diff, second as a constructor argument for the Farewell. We can't inline the value by creating two separate instances of class Secret, because it really has to be the same object---it encapsulates the number that we hide from the user in a number-guessing game.

There could be many other situations where a value needs to be used multiple times while remaining unmodifiable. Why do we still call these values variables if technically they are constants?

I'm suggesting we introduce "monikers" for these values, assigning them through the as keyword. For example:

public class Main {
  public static void main(String... args) {
    new Farewell(
      new Attempts(
        new VerboseDiff(
          new Diff(
            new Secret() as secret,
            new Guess()
          )
        ), 5
      ),
      secret
    ).say();
  }
}

Here new Secret() is the inlined value and secret is its moniker, which we use a few lines later.

It would be great to have this feature in Java, right?


How Does Inversion of Control Really Work?

  • Odessa, Ukraine

IoC seems to have become the cornerstone concept of many frameworks and object-oriented designs since it was described by Martin Fowler, Robert Martin and others ten years ago. Despite its popularity IoC is misunderstood and overcomplicated all too often.

Le conseguenze dell'amore (2004) by Paolo Sorrentino

Look at this code:

print(book.title());

It is very straightforward: we retrieve the title from the book and simply give it to the print() procedure, or whatever else it might be. We are in charge, the control is in our hands.

In contrast to this, here is the inversion:

print(book);

We give the entire book to the procedure print() and it calls title() when it feels like it. That is, we delegate control.

This is pretty much everything you need to know about IoC.

Does it have anything to do with dependency injection containers? Well, of course, we could put the book into a container, inject the entire container into print(), let it retrieve the book from the container and then call title(). But that's not what IoC is really about---it's merely one of its perverted usage scenarios.

The main point of IoC is exactly the same as I was proposing in my previous posts about naked data and object friends: we must not deal with data, but instead only with object composition. In the given example the design would be even better if we got rid of the print() procedure altogether and replaced it with an object:

new PrintedBook(book);

That would be pure object composition.
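The two styles can be put side by side in a short sketch. `Book`, `Caller` and `PrintedBook` here are illustrative stand-ins, not classes from any library:

```java
interface Book {
  String title();
}

// Control stays with the caller: we extract the data ourselves.
final class Caller {
  static String render(String title) {
    return "== " + title + " ==";
  }
}

// Control is inverted: the object decides when to call title().
final class PrintedBook {
  private final Book book;
  PrintedBook(Book book) {
    this.book = book;
  }
  @Override
  public String toString() {
    return "== " + this.book.title() + " ==";
  }
}

public class Ioc {
  public static void main(String[] args) {
    Book book = () -> "Elegant Objects";
    System.out.println(Caller.render(book.title())); // we are in charge
    System.out.println(new PrintedBook(book));       // control is delegated
  }
}
```

In the second line the book is never unpacked by the caller; `PrintedBook` pulls the title out only when it needs it, which is the inversion.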

There is not much more to say on this subject; I hope I have cleared it up for you---it is just as simple as that.


A Remote Slave Is Still a Slave

  • Moscow, Russia

Working remotely is definitely a trend, according to the BLS and my personal observations. "Let them work from home" seems to be the silver bullet for every second startup and even some big companies like Buffer, Automattic, Groove, and many others. However, in most cases, the replacement of a brick-and-mortar office with a virtual one doesn't help companies and their employees become more productive.

Happiness (1998) by Todd Solondz

Working from home, also known as working from anywhere, working remotely, or simply telecommuting, usually leads to four radical changes:

  • First, we replace office security cameras on the ceiling with screen monitoring software and/or web cameras that constantly watch us. Upwork does it to its thousands of freelancers, requiring them to install a background screen capture utility and make sure it is running during their entire working time.

  • Second, we replace meeting rooms with conference software like Zoom or good old Skype, and learn how to make virtual meetings productive. The main principle is similar to the one we use for traditional meetings: Make it fun. Otherwise we will switch to Facebook or PokerStars while the manager is still talking in her window.

  • Third, we replace Post-it stickies and Scrum boards with tickets and tasks in Trello or any other task tracker. We keep everybody busy learning new tools and filling them with data.

  • Fourth, and most importantly, we replace office chit-chats with Slack channels or HipChat groups. Instead of being in the office at 9 a.m., everybody will have the ability to demonstrate the level of their presenteeism using emojis.

Your life, as a developer, will be very different on a team that works remotely. It will be full of real freedom. However, there will be certain problems:

It's Harder to Prove Innocence. How does your manager know you're actually working, not watching TV, playing with your kids, or grooming your favorite open source pets? In the office, it's easier; you just sit in front of the monitor and the boss is happy. At home or in a coffee shop, you are guilty by default until you prove the opposite by creating new tickets, posting emojis to Slack, and sending emails with multiple CCs.

It's Harder to Find Information. In the office, you can always ask your question out loud for the most knowledgeable person to help you. Remotely, however, you will have to search Wiki pages, Google Docs, and, god forbid, Git logs. Be prepared for almost nobody to be interested in sharing knowledge or teaching you remotely---what is the point if nobody sees him doing it? You will be on your own in front of that year-old documentation made by someone fired right after writing it up.

It's Harder to Avoid Overtime. In the office, you just walk away with everybody else when time is up. Remotely, you will have phone calls and Slack discussions in the middle of the night on Sunday, especially if the team is distributed across multiple time zones. Everybody will expect 24-7 availability because you're working from home. And changing the status to "Do not disturb" in Skype will only offend them and seriously damage your reputation. In a work-from-home environment, good programmers are those who are always online.

It's Harder to Defend Yourself. You won't know where to expect the next bout of trouble from. In the office, you can visually observe the territory, smell conflict in the air, see how people move, what they talk about, and who is the next victim. Remotely, you're sitting in isolation, extracting pieces of information from Slack. You can easily become the next victim without even noticing it. Simply put, they will stab you in the back while you're writing Java and thinking about a new unit test.

It's Harder Not to Waste Time. It sounds counter-intuitive, but while sitting at home, you waste much more time than in the office. Meetings will be called more frequently, and they will be less effective, thanks to constant issues with scheduling, quality of connection, software failure, and of course a lack of online communication skills. The manager will call more meetings largely in order to convince himself that the team is actually working; hearing your voice many times a day is a perfect confirmation that you are busy and he is a good manager.

On a more serious note, I believe that working remotely only makes things worse when the motivation of the team is still based on hourly pay. A slave working remotely is still a slave. Moreover, this slave is harder to manage and control. First, we have to switch to a results-based payment model and then go remote. Actually, remote work will happen automatically as soon as everybody starts getting their paychecks for the results they deliver, not the time they spend. They will simply stop showing up at the office, but their results will keep coming to the project, without any web cameras or Slack channels.

It's no surprise that many big companies have already decided to stop this telecommuting nonsense and put their slaves back behind closed doors; take Google, Yahoo, and Best Buy for example.

© Yegor Bugayenko 2014–2018

SixNines.io, Your Website Availability Monitor

  • Moscow, Russia

Availability is a metric that demonstrates how often your website is available to its users. Technically, it's the ratio between the number of successful attempts to open the website and the total number of attempts. If one out of a hundred attempts fails, we can say the availability is 99 percent. High-quality websites aim for so-called "six nines" availability, named for the number of 9s in the ratio: 99.9999 percent. We created a service that helps you measure this metric and demonstrate its value to your users: SixNines.

All you need to do is log in using your GitHub account and register your URI, which SixNines will validate. This operation will cost you $4.95 (processed by Stripe). It's a once-in-a-lifetime fee (if you don't have any money, email us and we'll figure something out). As soon as your URI is in the system, we will monitor it forever.

The key problem with the availability metric is that it takes a lot of time to really guarantee a figure of "six nines." Indeed, look at the ratio: 99.9999 percent means that only one attempt out of a million fails (provided we make them regularly, at equal intervals). To ping your website at least a million times at one-minute intervals, we would have to keep pinging for 694 days straight, which is almost two years.
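The arithmetic can be double-checked with a few lines of Ruby (a throwaway sketch, not part of the SixNines code base):

```ruby
pings = 1_000_000             # samples needed to claim "six nines"
minutes_per_day = 24 * 60     # one ping per minute, 1440 per day
days = pings / minutes_per_day.to_f
puts days.round(1)            # → 694.4

# the ratio itself: one failure out of a million attempts
availability = (pings - 1) * 100.0 / pings
puts format('%.4f%%', availability) # → "99.9999%"
```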

Only after those two years of 24-7 work would we be able to prove that your website is of "high availability" and the value of the metric really is equal to 99.9999 percent (you would have all kinds of reasons to be proud of it).

You can see some big websites being checked by SixNines already:

Google
Google at SixNines

Facebook
Facebook at SixNines

Twitter
Twitter at SixNines

Yahoo
Yahoo at SixNines

Instagram
Instagram at SixNines

And this is my blog's availability (it's hosted on GitHub Pages):

Yegor256 at SixNines

As you see, their numbers are far from "six nines" because we just started to measure them. In two years, we will see what their real availability is.

SixNines is written in Ruby and is open source, hosted on GitHub. Don't hesitate to contribute.

Why I Won't Help You Via Email

  • Moscow, Russia

I've been blogging and writing for almost three years now, and a few times a week I get emails or Facebook and Telegram messages from people I don't really know. They ask questions about Java, management, object-oriented programming, and other things they believe I understand and can help them with. Well, my contact details are published right in the header on my blog---what else would I expect, right? True, but even though I always reply to them, I never answer their questions.

Il Divo (2008) by Paolo Sorrentino

Why?

Because it's ineffective.

Honestly, I believe that the more people I help, the better. The bigger my contribution to the community is, the better---for both the community and me personally. That's why I try to answer every single comment on this blog, on my YouTube channel, in our Gitter chat, and on Twitter. I spend at least two hours a day on this activity.

That's why I'm asking everybody who seeks help from me personally to go to one of these public channels and ask there. Then, I will find their questions and answer them. Maybe even someone else will reply, because the community is pretty big now.

When the question is asked publicly, everybody can see our conversation and learn something from it. This is the most effective use of my time and the time of those who are asking.

Thus, you can expect no help provided privately, but all public questions will be answered.

Flexibility Equates to Lower Quality

  • Moscow, Russia

There are two opposing mindsets: "If it works, it's good" vs. "If it's good, it works"; or "Make it work" vs. "Make it right." I'm talking about software source code. I've been hearing this almost every day in blog comments: Why do we need all those new OOP principles if our code works just fine without them? What is the point of introducing a new way, which is supposed to be "better," if the existing, traditional, semi-object, semi-procedural, not-so-perfect approach just works?

Scarface (1983) by Brian De Palma

Let's try to think bigger. And not just about object-oriented programming, but in general about software development. There are many examples of the "just works" mentality.

Take Perl, a programming language famous for its ability to do anything in three different ways. This means that there is no one "right" way. I'm not a Perl expert; that's why I'll have you look at this Ruby code instead:

if a > b
  m = 'Hello!'
end

We can rewrite it like this:

m = if a > b
  'Hello!'
end

Or this:

m = 'Hello!' if a > b

And one more:

m = a > b ? 'Hello!' : nil

Which one is "right"? Are there any Perl developers here? Can you suggest some other way to achieve the same result?

Not surprisingly, in Java (a stricter language than Ruby), there is only one way to do it:

if (a > b) {
  m = "Hello!";
}

Well, I guess I'm wrong; there are two, actually. Here is the second one:

if (a > b) m = "Hello!";

What does this variety of options give us, as programmers? I guess the answer seriously depends on what we, the programmers, are doing with the code: writing or reading it. Also, it depends on our attitude toward the software we're creating; we either own it (hacker mentality) or build it (designer mentality).

If we're writing it, and we love to think about ourselves as code owners, we definitely will need that arsenal of syntactic sugar weapons. We need them to prove to ourselves that we're smart and, of course, to show off in front of our friends and that soulless Ruby interpreter.

On the other hand, if we're designers and happen to read the code that is full of sugar, which "just works," we'll be very annoyed and frustrated. Well, maybe I have to speak for myself, but I definitely will be.

This overly sugared Ruby syntax is a perfect example of "works vs. good" positioning. The Ruby philosophy is this: It doesn't matter how you write it, as long as it works. The Java philosophy is different; it's much closer to: Make it right and it will work. The dynamic typing in Ruby vs. the static typing in Java also proves my point.

In general, I believe that the more flexible the programming language is, the lower the maintainability---the key quality characteristic---of the code it produces. Simply put, higher quality comes from simpler languages.

The same is true for software development as a whole: The more restrictions we put on programmers and the fewer options they have for their "syntax creativity," the higher the quality of the software they write. Static analyzers like Checkstyle for Java or RuboCop for Ruby attempt to solve this problem by prohibiting certain language features, but they lag far behind. We are very "creative."
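For illustration only, here is roughly what such a restriction looks like in a .rubocop.yml; the cop names come from RuboCop's standard set, while the thresholds are arbitrary choices of mine:

```yaml
# A sketch of .rubocop.yml: narrow Ruby's "many ways to say it"
# down to fewer ways.
Style/IfUnlessModifier:
  Enabled: false      # stop pushing multi-line ifs into one-liners
Metrics/MethodLength:
  Max: 10             # arbitrary threshold: keep methods short
Metrics/AbcSize:
  Max: 15             # arbitrary threshold: limit method complexity
```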

Now, let's get back to the original OOP question: Why do we need to improve anything if it works the way it is? Here is the answer: Modern OOP (as in Java, Ruby, and C++) doesn't produce quality code because it doesn't have a strong and properly restricted paradigm. It just has too many "features," which were mostly introduced by C++ and remained there for our mutual "convenience."

They indeed work, but the maintainability of the software we produce is very low. Well, it's way lower than it could be, if our "creativity" would be restricted.

PDD in Action

  • Moscow, Russia

Puzzle Driven Development (PDD) is a methodology we've been practicing on our teams for more than seven years. Using PDD, we delegate the responsibility of task decomposition to the task's performers, eliminating the role of a project manager. We had been using proprietary software for that; a month ago, we made it public, open source, and free. It is available as 0pdd, a GitHub-based chat bot.

Here is how you configure it, in two steps. First, you grant access to @0pdd in GitHub (if your repository is private).

Second, you add a webhook to your GitHub repository: http://www.0pdd.com/hook/github (with just the push event and any content type). I would actually recommend sending GitHub notifications through ReHTTP and using this URL for the webhook: http://p.rehttp.net/http://www.0pdd.com/hook/github.

Now, your repository is being watched by 0pdd. Every time you git push something new, it does a git pull and retrieves your changes from GitHub. Then it runs pdd, a command line tool that scans the entire code base (only the master branch) and finds all occurrences of @todo markers.

For all newly found markers, 0pdd will submit new issues to the GitHub issue-tracking section of your repository.

Also, when you remove markers from your code base, 0pdd will immediately close issues it created.

Now, when an issue is assigned to a programmer, we allow him or her to cut corners and return incomplete code back to the master. If and when the code is not complete, we ask the programmer to leave @todo markers in the code, called "puzzles." Later, these puzzles will be assigned to other programmers, and so on. Eventually, the problem will be fixed when most puzzles are resolved.
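A puzzle is just a comment in a special format; as far as I understand the pdd convention, it is @todo followed by the parent issue number and a time estimate. The class below is a made-up example of what pdd would pick up:

```ruby
class Book
  # @todo #13:30min The title is not validated yet; we accept
  #  empty strings here. Add validation and a unit test for it.
  def initialize(title)
    @title = title
  end
end
```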

0pdd helps you automate this process and provides a summary report of the current situation with all puzzles in the entire code base. You can even add a nice badge to your GitHub repo:

PDD status

If you click it, you will see the full report of all puzzles currently present and previously seen.

This mechanism helps us in many projects. You can use it for free. It's an open source Ruby product; feel free to contribute.

SOLID Is OOP for Dummies

  • Kharkiv, Ukraine

You definitely know the SOLID acronym. It stands for five principles of object-oriented programming that, if followed, are supposed to make your code both legible and extensible. They were introduced almost 30 years ago, but have they really made us better programmers in the time since? Do we really understand OOP better thanks to them? Do we write more "legible and extensible" code? I don't think so.

Dumb & Dumber (1994) by Peter Farrelly

Let's go one by one and see how they "help."

S

The "S" refers to the Single Responsibility Principle, which, according to Clean Code by Robert Martin, means that "a class should have only one reason to change."

This statement sounds extremely vague to me, but the book explains it, stating that objects must be problem-centered and responsible for "one thing." It's up to us to decide what that one thing is, of course.

This is what we know as "high cohesion" since Larry Constantine wrote about it in the IBM Systems Journal in 1974. Why was it necessary to create a new principle 15 years later with an ambiguous name and a very questionable definition?

O

This letter is about the Open/Closed Principle, which was introduced by Bertrand Meyer in Object-Oriented Software Construction in 1988. Simply put, it means that an object should not be modifiable. I can't agree more with this.

But then it says an object should be extendable, literally through implementation inheritance, which is known to be an anti-OOP technique. Thus, this principle is not really applicable to objects and OOP. It may work with modules and services, but not with objects.

L

The third letter is for the Liskov Substitution Principle, which was introduced by Barbara Liskov in 1987. This one is the most innocent part of the SOLID pentad. In simple words, it states that if your method expects a Collection, an ArrayList will work.

It is also known as subtyping and is the foundational component of any object-oriented language. Why do we need to call it a principle and "follow" it? Is it at all possible to create any object-oriented software without subtyping? If this one is a principle, let's add "variables" and "method calling" here too.

Honestly, I suspect that this principle was added to SOLID mostly in order to somehow fill the gap between "SO" and "ID."

I and D

I guess they both were introduced by Robert Martin while he was working at Xerox.

The Interface Segregation Principle states that you must not declare List x if you only need Collection x or even Iterable x. I can't agree more. Let's see the next one.

The Dependency Inversion Principle means that instead of ArrayList x, you must declare List x and let the provider of the object decide whether it is ArrayList or LinkedList. This one also sounds reasonable to me.

However, how is all this different from the good old "loose coupling" introduced together with cohesion by Constantine in 1974? Do we really need to simplify and blur in order to learn better? No, not to learn better, but to sell better. Here goes my point.

My point is...

The point is that these principles are nothing but an explanation of "cohesion and coupling" for dummies, in a very primitive, ambiguous, and marketable way. Dummies will buy books, seminars, and training sessions but won't really be able to understand the logic behind them. Do they really need to? They are just monkeys... pardon, coders, right?

"But an object must be responsible for one thing!" is what I often hear at conferences. People learn that mantra without even knowing what cohesion is nor understanding what this "one thing" they are praying for really is. There is no such thing as "one thing," guys! There are different levels of cohesion.

Who is guilty? Uncle Bob & Co.

They are no better than Ridley Scott and other Hollywood moneymakers who deliver primitive, easy-to-cry-at movies just to generate a profit. People get dumber by watching, but that is not the moviemakers' concern. The same happens with magic OOP principles: programmers rely on them, thinking the truth is right there, while the real truth is not understood even by the creators of this "magic."

SOLID is a money-making instrument, not an instrument to make code better.

The TDD That Works for Me

  • Frankfurt, Germany

Test-driven development (a.k.a. TDD) was rediscovered by Kent Beck and explained in his famous book in 2002. In 2014, David Heinemeier Hansson (the creator of Ruby on Rails) said that TDD is dead and only harms architecture. Robert Martin (the inventor of the SOLID principles) disagreed and explained that TDD fails to work only in certain cases. A few days later, he even compared the importance of TDD to the importance of hand-washing in medicine and added that "it would not surprise me if, one day, TDD had the force of law behind it." Two years later, just a few months ago, he wrote more about it, and more, and more. This subject seems to be hot. Of course, I have my own take on it; let me share it.

La grande bellezza (2013) by Paolo Sorrentino

In theory, TDD means "writing tests first and code next." In practice, according to my experience while working with more than 250 developers over the last four years, it means writing tests when we're in a good mood and have nothing else to do. And this is only logical, if we understand TDD literally, by the book.

Writing a test for a class without having that class in front of you is difficult. I would even say impossible, if we are talking about real code, not calculator examples. It's also very inefficient, because tests by definition are much more rigid than the code they validate---creating them first will cause many re-do cycles until the design is stabilized.

I've personally written almost 300,000 lines of code in Java, Ruby, PHP, and JavaScript over the last four years, and I have never done TDD by the book: "write a test, make it run, make it right." Ever.

Code, Deploy, Break, Test, Fix

Even though I'm a huge fan of automated testing (unit or integration) and totally agree with Uncle Bob: Those who don't write tests must be put in jail, I just have my own interpretation of TDD. This is how it looks:

  • First, I write code without any tests. A lot of code. I implement the functionality and create the design. Dozens of classes. Of course, the build is automated, the deployment pipeline is configured, and I can test the product myself in a sandbox. I make sure "it works on my machine."

  • Then, I deploy it to production. Yes, it goes to my "users" without any tests because it works for me. They are either real users if it's something open source or one of my pet projects, or manual testers if it's a money project.

  • Then, they break it. They either test it or they use it; it doesn't matter. They just find problems and report bugs. As many as they can.

  • Right after some bugs are reported, I pick the most critical of them and...voilà!...I create an automated test. The bug is a message to me that my tests are weak; I have to fix them first. A new test will prove that the code is broken. Or maybe I fix an existing one. This is where I go "tests first." I don't touch the production code until I manage to break my build and prove the problem's existence with a new test. Then, I do git commit.

  • Finally, it's time to fix the problem. I make changes to the production code in order to make sure the build is green again. Then, I do git commit and git push. And I go back to the "deploy" step; the updated product goes to my users.
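The "test first, then fix" step can be sketched with Minitest; the Title class, its nil-title bug, and the test names below are all hypothetical, invented for illustration:

```ruby
require 'minitest/autorun'

# Hypothetical production class. A user reported a bug:
# a book with a nil title crashes. Per the workflow above,
# we first wrote a failing test, then fixed the code.
class Title
  def initialize(text)
    @text = text
  end

  def caps
    # the fix: tolerate nil (before, @text.split raised NoMethodError)
    @text.to_s.split.map(&:capitalize).join(' ')
  end
end

class TitleTest < Minitest::Test
  # This test failed before the fix; now it stays in the
  # suite forever as a regression guard.
  def test_tolerates_nil_title
    assert_equal '', Title.new(nil).caps
  end

  def test_capitalizes_words
    assert_equal 'Elegant Objects', Title.new('elegant objects').caps
  end
end
```

Only after the red test is committed does the production fix go in, and git push sends both to the pipeline.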

Once in a while, I have to make serious modifications to the product, like to introduce a new feature or perform a massive refactoring. In this case, I go back to the first step and do it without tests.

The Reasoning Behind

The justification behind this no-tests-upfront approach is simple: We don't need to test anything until it's broken, mostly because we understand that it's technically impossible to test everything or to fix all bugs. We have to fix only what's visible and intolerable to the business. If the business doesn't care, or our users/testers don't see our bugs, we must not waste project resources on fixing them.

On the other hand, when the business or our users/testers are complaining, we have to be very strict with ourselves; our testing system is weak and must be fixed first. We can't just fix the production code and deploy, because in that case we may make the same mistake again after some refactoring, and our tests won't catch it. The user will find the bug again, and the business will pay us again to fix it. That would be a waste of resources.

As you can see, it's all money-driven. First, don't fix anything if nobody pays for it. Second, fix it once and for all if they actually paid. It's as simple as that.

The Dynamics

Thanks to this test-and-fix-only-when-broken approach, the balance between production code and test code is not the same over the entire project lifecycle. When the project starts, there are almost no tests. Then, the number of tests grows together with the number of bugs. Eventually, the situation stabilizes, and we can move the product from beta version to the first release.

I created a simple command line tool in order to demonstrate the statistics from a few projects of mine, to prove my point. Take a look at these graphs:

yegor256/takes (Web framework, Java):

yegor256/xembly (XML builder, Java):

jcabi/jcabi-aspects (AOP library, Java):

yegor256/s3auth (S3 gateway, Java):

First commercial project:

Second commercial project:

In each graph, there are two parts. The first one on the top demonstrates the dynamics of production Hits-of-Code (green line), test-related HoC (red line), and the number of issues reported to GitHub (orange line).

The bottom part shows how big the test-related HoC portion is relative to all project activity. In other words, it shows how much effort the project invested into automated tests, compared with the total effort.

This is what I want you to pay attention to: The shape of the curve is almost the same in every project. It looks very similar to a learning curve, where we start to learn fast and then slow down over time:

The figure

This perfectly illustrates what I just described above. I don't need tests at the beginning of the project; I create them later when my users express the need for them by reporting bugs. This dynamic looks only logical to me.

You can also analyze your project using my tool and see the graph. It would be interesting to learn what kind of curve you will get.

Traits and Mixins Are Not OOP

  • Odessa, Ukraine

Let me say right off the bat that the features we will discuss here are pure poison brought to object-oriented programming by those who desperately needed a lobotomy, just like David West suggested in his Object Thinking book. These features have different names, but the most common ones are traits and mixins. I seriously can't understand how we can still call programming object-oriented when it has these features.

Fear and Loathing in Las Vegas (1998) by Terry Gilliam

First, here's how they work in a nutshell. Let's use Ruby modules as a sample implementation. Say that we have a class Book:

class Book
  def initialize(title)
    @title = title
  end
end

Now, we want class Book to use a static method (a procedure) that does something useful. We may either define it in a utility class and let Book call it:

class TextUtils
  def self.caps(text)
    text.split.map(&:capitalize).join(' ')
  end
end
class Book
  def print
    puts "My title is #{TextUtils.caps(@title)}"
  end
end

Or we may make it even more "convenient" and include our module in order to access its methods directly:

module TextModule
  def caps(text)
    text.split.map(&:capitalize).join(' ')
  end
end
class Book
  include TextModule
  def print
    puts "My title is #{caps(@title)}"
  end
end

It seems nice---if you don't understand the difference between object-oriented programming and static methods. Moreover, if we forget OOP purity for a minute, this approach actually looks less readable to me, even though it has fewer characters; it's difficult to understand where the method caps() is coming from when it's called just like #{caps(@title)} instead of #{TextUtils.caps(@title)}. Don't you think?

Mixins start to play their role better when we include them. We can combine them to construct the behavior of the class we're looking for. Let's create two mixins. The first one will be called PlainMixin and will print the title of the book the way it is, and the second one will be called CapsMixin and will capitalize what's already printed:

module CapsMixin
  def to_s
    super.to_s.split.map(&:capitalize).join(' ')
  end
end
module PlainMixin
  def to_s
    @title
  end
end
class Book
  def initialize(title)
    @title = title
  end
  include CapsMixin, PlainMixin
  def print
    puts "My title is #{self}"
  end
end

Calling Book without the included mixin will print its title the way it is. Once we add the include statement, the behavior of to_s is overridden and method print produces a different result. We can combine mixins to produce the required functionality. For example, we can add one more, which will abbreviate the title to 16 characters:

module AbbrMixin
  def to_s
    super.to_s.gsub(/^(.{16,}?).*$/m,'\1...')
  end
end
class Book
  def initialize(title)
    @title = title
  end
  include AbbrMixin, CapsMixin, PlainMixin
  def print
    puts "My title is #{self}"
  end
end

I'm sure you already understand that they both have access to the private attribute @title of class Book. They actually have full access to everything in the class. They literally are "pieces of code" that we inject into the class to make it more powerful and complex. What's wrong with this approach?

It's the same issue as with annotations, DTOs, getters, and utility classes---they tear objects apart and place pieces of functionality in places where objects don't see them.

In the case of mixins, the functionality is in the Ruby modules, which make assumptions about the internal structure of Book and further assume that the programmer will still understand what's in Book after the internal structure changes. Such assumptions completely violate the very idea of encapsulation.

Such tight coupling between mixins and an object's private structure leads to nothing but unmaintainable, difficult-to-understand code.

The obvious alternative to mixins is composable decorators. Take a look at this example from my article about them:

Text text = new AllCapsText(
  new TrimmedText(
    new PrintableText(
      new TextInFile(new File("/tmp/a.txt"))
    )
  )
);

Doesn't it look very similar to what we were doing above with Ruby mixins?

However, unlike mixins, decorators leave objects small and cohesive, layering extra functionality on top of them. Mixins do the opposite---they make objects more complex and, thanks to that, less readable and maintainable.
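The same layering can be written in Ruby with plain decorators; the classes below are hypothetical, mirroring the Java example, and each one relies only on the public to_s of what it wraps, never on private attributes:

```ruby
# Each decorator wraps a text object and adds one behavior on top.
class PlainText
  def initialize(text)
    @text = text
  end

  def to_s
    @text
  end
end

class CapsText
  def initialize(origin)
    @origin = origin
  end

  def to_s
    @origin.to_s.split.map(&:capitalize).join(' ')
  end
end

class AbbrText
  def initialize(origin)
    @origin = origin
  end

  def to_s
    @origin.to_s.gsub(/^(.{16,}?).*$/m, '\1...')
  end
end

puts AbbrText.new(CapsText.new(PlainText.new('object thinking by david west')))
# → "Object Thinking ..."
```

Unlike the mixin version, removing or reordering a layer here is a one-line change in the composition, not a change inside Book.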

I honestly believe they are just poison. Whoever invented them was a long way from understanding the philosophy of object-oriented design.

How to Handle the Problem of Too Many Classes

  • Odessa, Ukraine

During nearly every presentation in which I explain my view of object-oriented programming, there is someone who shares a comment like this: "If we follow your advice, we will have so many small classes." And my answer is always the same: "Of course we will, and that's great!" I honestly believe that even if you can't consider having "a lot of classes" a virtue, you can't call it a drawback of any truly object-oriented code either. However, there may come a point when classes become a problem; let's see when, how, and what to do about that.

El día de la bestia (1995) by Álex de la Iglesia

There were a number of "rules" previously mentioned that, if applied, would obviously lead to a large number of classes, including: a) all public methods must be declared in interfaces; b) objects must not have more than four attributes (Section 2.1 of Elegant Objects); c) static methods are not allowed; d) constructors must be code-free; e) objects must expose fewer than five public methods (Section 3.1 of Elegant Objects).

The biggest concern, of course, is maintainability: "If, instead of 50 longer classes, we had 300 shorter ones, then the code would be way less readable." This will most certainly happen if you design them wrong.

Types (or classes) in OOP constitute your vocabulary, which explains the world around your code---the world your code lives in. The richer the vocabulary, the more powerful your code. The more types you have, the better you can understand and explain the world.

If your vocabulary is big enough, you will say something like:

Read the book that is on the table.

With a much smaller vocabulary, the same phrase would sound like:

Do it with the thing that is on that thing.

Obviously, it's easier to read and understand the first phrase. The same occurs with types in OOP: the more of them you have at your disposal, the more expressive, bright, and readable your code is.

Unfortunately, Java and many other languages are not designed with this concept in mind. Packages, modules, and namespaces don't really help, and we usually end up with names like AbstractCookieValueMethodArgumentResolver (Spring) or CombineFileRecordReaderWrapper (Hadoop). We're trying to pack as many semantics into class names as possible so their users won't doubt for a second. Then we're trying to put as many methods into one class as possible to make life easier for users; they will use their IDE hints to find the right one.

This is anything but OOP.

If your code is object-oriented, your classes must be small, their names must be nouns, and their method names must be just one word. Here is what I do in my code to make that happen:

Interfaces are nouns. For example, Request, Directive, or Domain. There are no exceptions. Types (also known as interfaces in Java) are the core part of my vocabulary; they have to be nouns.

Classes are prefixed. My classes always implement interfaces. Thanks to that, I can say they always are requests, directives, or domains. And I always want their users to remember that. Prefixes help. For example, RqBuffered is a buffered request, RqSimple is a simple request, RqLive is a request that represents a "live" HTTP connection, and RqWithHeader is a request with an extra header.

An alternative approach is to use the type name as the central part of the class name and add a prefix that explains implementation details. For example, DyDomain is a domain that persists its data in DynamoDB. Once you know what that Dy prefix is for, you can easily understand what DyUser and DyBase are about.

In a medium-sized application or a library, there will be as many as 10 to 15 prefixes you will have to remember, no more. For example, in the Takes Framework, there are 24,000 lines of code, 410 Java files, and 10 prefixes: Bc, Cc, Tk, Rq, Rs, Fb, Fk, Hm, Ps, and Xe. Not so difficult to remember what they mean, right?

Among all 240 classes, the longest name is RqWithDefaultHeader.

I find this approach to class naming rather convenient. I used it in these open source projects (in GitHub): yegor256/takes (10 prefixes), yegor256/jare (5 prefixes), yegor256/rultor (6 prefixes), and yegor256/wring (5 prefixes).

Why I Don't Talk to Google Recruiters

  • Odessa, Ukraine

This is a real story, and it's not only about Google. I'm getting emails from recruiters at Amazon, Facebook, and smaller Silicon Valley startups. They find me somehow, most likely through this blog, my books, or my GitHub account. They always start with "We're so impressed by your profile" and finish with "Let's schedule an interview." I always reply with the same text, and they always disappear, only to come back in a few months under a different name. Let me explain my reasons; maybe you will do the same and we can change this situation in the industry.

The Deer Hunter (1978) by Michael Cimino

Disclaimer: I do realize that these are multi-billion-dollar companies, the best in the industry, and I'm nothing compared to them. I do realize that their recruiters don't care about my answers---they simply click "delete" and move on. I also realize that they will never see this post, and this article probably won't change anything. However, I have to write it.

This is what I'm sending back to them:

Thanks for your email. I'm very interested indeed. I have nothing against an interview. However, there is one condition: I have to be interviewed by the person I will be working for. By my future direct manager.

The recruiter who gets this reply never gets back to me.

Why do I send this?

Well, because I learned my lesson two years ago, when Amazon tried to recruit me. I got an email from the company that said they were so impressed by my profile and couldn't wait to start working with me. They needed me, nobody else. I was naive, and the message did flatter me.

We scheduled an interview in the head office in Seattle. They paid for my ticket to fly there (from San Francisco) and a night in a 5-star hotel. I was impressed. They definitely were interested. So was I.

What happened at the interview was, most probably, very close to what Max Howell experienced with Google: some programmers who didn't know a thing about my profile asked me to invent some algorithms on a white board for almost four hours. Did I manage? I don't think so. Did they make me an offer? No.

What did I learn?

That it was a waste of time. For both sides.

Their bureaucratic machine is designed to process hundreds of candidates a month. In order to fish and attract them, there is an army of recruiters sending warm emails to people like me. They have to screen candidates somehow, and they are too lazy to make this process effective and creative. They just send them through random programmers who are supposed to ask questions as complex as possible.

I'm not saying that people who pass their tests are not good programmers. I'm also not saying that I'm a good programmer---let's face it, I didn't pass the test. I do believe this filtering system is rather good. My point is that it contradicts the original email I got from the recruiter.

If she had started her email with "We're looking for an algorithm expert," we would never have gotten any further and would not have wasted our time. Clearly, I'm not an expert in algorithms. There is no point in giving me binary-tree-traversing questions; I don't know those answers and will never be interested in learning them. I'm trying to be an expert in something else, like object-oriented design, for example.

There was a clear mismatch between my profile and the expectations of the interviewers. I don't blame them, and I don't blame her. They all were just employees. I blame myself for not setting this all straight at the very beginning.

I should have told her that I didn't want to be interviewed by some programmers, because I would most certainly fail. There was no need to try. I wanted to be interviewed by the person who really needed me: my future boss. That person will understand my profile and won't ask pointless questions about algorithms, simply because he or she will know what my duties will be and what kind of problems I will be capable of solving, if they hired me.

Unfortunately, as I keep observing from two years of bouncing such emails back to recruiters, they can't change anything. They have to provide formal and standard screening for everybody, beginning with those same warm and flattering initial promises.

I'm sorry, recruiters, no more standard interviews for me.

StackOverflow Is Your Mandatory Tool

  • Odessa, Ukraine

I've said before that your StackOverflow reputation is very important to us when we make a decision on how much we should pay a software developer. However, there were many complaints about this metric. Take, for example, the ones here and here. In a nutshell, so many of you disagreed and said that the number of StackOverflow up-votes was nothing more than a measurement of the amount of time someone spent answering stupid questions asked by clueless programmers. Let me disagree and explain why your activity on this platform is so important to your career.

Les kidnappeurs (1998) by Graham Guit

Basically, your StackOverflow profile demonstrates five skills you either have or don't. They may not be as important to an office worker, but if you're going to work remotely, they are crucial.

How to Search. The StackExchange knowledge base is huge and contains answers to almost any software question you may ask. You have to know how to search it, and not only via Google. You have to be familiar with the platform and its key features, and you can't learn that without being an active user. When your reputation is high, it's a clear indicator to me, your potential employer, that you're aware of how to find the right information in this knowledge base.

How to Ask. Asking a friend near the coffee machine is one thing. Asking a community of 6+ million developers is a totally different thing. You have to learn how to explain your problem, how to formulate the question, how to label it and title it. Try it for the first time and you will see that it's not easy at all; your questions will sound immature, silly, and ambiguous, and they will end with "Best regards" (something you shouldn't do at SO). And, of course, they will get zero up-votes. Later, when you improve, you will be surprised to see that more and more of them get up-votes, and your reputation will grow. This will be the indicator that your "question asking" skill is growing. For me, your potential employer, it's a very important skill.

How to Answer. Initially, you will be afraid to answer. Then, most of your answers will be down-voted. Then, some of them will be accepted as best answers. Eventually, some of them will start getting up-votes. Until that happens, you will go through a lot of frustration and negative emotions. You will learn how to make your answers helpful---not just to your friends, because they don't want to offend you by saying that you have no idea what you're talking about, but to strangers, who care more about the information you're able to deliver than they care about you personally. That's a skill you can't buy; you have to earn it. And it's crucial in a distributed team.

How to Deal With Morons. You know what to do with them in the office, but on the Internet, they are much more aggressive and offensive. And there are many of them. You need to learn and practice a lot before you become competent enough to fetch information out of that programming community without pulling your hair out and screaming at the monitor. StackOverflow will help you a lot, both through questions you will ask and answers you will try to give. And you can't learn that in the office dealing with your friends only.

How to Deal With Smart-asses. Some people there are very smart and knowledgeable, and they will not always be polite when your questions or mistakes using the platform border on being too annoying. Again, your office friends won't teach you how to deal with those gurus so you can tap their knowledge; you have to be actively involved in StackOverflow discussions. This skill is very important for distributed programming, where you have to solve most of the problems on your own.

To summarize, StackOverflow is a must-have instrument for any modern software developer, no matter your programming language, your age, your project, or your professional level. It's like an IDE and unit tests---you just use them in order to develop faster. Some people are still using vim or emacs and writing no tests, but you don't want to be like them.

StackOverflow is not just a website where you may have an account if you feel like it. It's a mandatory instrument you have to use if you want me, your potential employer, to value you as a serious engineer. And if you use this instrument on a daily basis, your reputation will inevitably reach high levels.

By the way, this is my StackExchange profile. I earned the majority of my reputation a few years ago, so now I'm mostly getting up-votes for the answers and questions I posted earlier. However, I keep using StackOverflow as I code, every day.

Each Private Static Method Is a Candidate for a New Class

  • Kharkiv, Ukraine

Do you have private static methods that help you break your algorithms down into smaller parts? I do. Every time I write a new method, I realize that it can be a new class instead. Of course, I don't make classes out of all of them, but that has to be the goal. Private static methods are not reusable, while classes are---that is the main difference between them, and it is crucial.

The Master (2012) by Paul Thomas Anderson

Here is an example of a simple class:

class Token {
  private String key;
  private String secret;
  String encoded() {
    return "key="
      + URLEncoder.encode(key, "UTF-8")
      + "&secret="
      + URLEncoder.encode(secret, "UTF-8");
  }
}

There is an obvious code duplication, right? The easiest way to resolve it is to introduce a private static method:

class Token {
  private String key;
  private String secret;
  String encoded() {
    return "key="
      + Token.encoded(key)
      + "&secret="
      + Token.encoded(secret);
  }
  private static String encoded(String text) {
    return URLEncoder.encode(text, "UTF-8");
  }
}

Looks much better now. But what will happen if we have another class that needs the exact same functionality? We will have to copy and paste this private static method encoded() into it, right?

A better alternative would be to introduce a new class Encoded that implements the functionality we want to share:

class Encoded {
  private final String raw;
  Encoded(final String str) {
    this.raw = str;
  }
  @Override
  public String toString() {
    return URLEncoder.encode(this.raw, "UTF-8");
  }
}

And then:

class Token {
  private String key;
  private String secret;
  String encoded() {
    return "key="
      + new Encoded(key)
      + "&secret="
      + new Encoded(secret);
  }
}

Now this functionality is 1) reusable, and 2) testable. We can easily use this class Encoded in many other places, and we can create a unit test for it. We were not able to do that with the private static method before.
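
To illustrate the testability point, here is a sketch of such a test in plain Java (the assertions and the EncodedTest class name are mine; I also use the Java 10+ URLEncoder.encode(String, Charset) overload, which avoids the checked exception thrown by the String-based one):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class Encoded {
  private final String raw;
  Encoded(final String str) {
    this.raw = str;
  }
  @Override
  public String toString() {
    // the Charset overload (Java 10+) throws no checked exception
    return URLEncoder.encode(this.raw, StandardCharsets.UTF_8);
  }
}

public class EncodedTest {
  public static void main(final String[] args) {
    // a space becomes '+', an ampersand becomes '%26'
    if (!new Encoded("a b").toString().equals("a+b")) {
      throw new AssertionError("space not encoded");
    }
    if (!new Encoded("k&v").toString().equals("k%26v")) {
      throw new AssertionError("ampersand not encoded");
    }
    System.out.println("OK");
  }
}
```

Try doing the same with a private static method: you can only exercise it indirectly, through the public methods of the class that owns it.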

See the point? The rule of thumb I've figured out for myself is that each private static method is a perfect candidate for a new class. That's why we don't have them at all in EO.

By the way, public static methods are a different story. They are also evil, but for different reasons.

P.S. Now I think that the reasons in this article are applicable to all private methods, not only static ones.

Decorating Envelopes

  • Lviv, Ukraine

Very often I need a class that implements an interface by making an instance of another class. Sound weird? Let me show you an example. There are many classes of that kind in the Takes Framework, and they all are named like *Wrap. It's a convenient design concept that, unfortunately, looks rather verbose in Java. It would be great to have something shorter, like in EO for example.

North by Northwest (1959) by Alfred Hitchcock

Take a look at RsHtml from Takes Framework. Its design looks like this (a simplified version with only one primary constructor):

class RsHtml extends RsWrap {
  RsHtml(final String text) {
    super(
      new RsWithType(
        new RsWithStatus(text, 200),
        "text/html"
      )
    );
  }
}

Now, let's take a look at that RsWrap it extends:

public class RsWrap implements Response {
  private final Response origin;
  public RsWrap(final Response res) {
    this.origin = res;
  }
  @Override
  public final Iterable<String> head() {
    return this.origin.head();
  }
  @Override
  public final InputStream body() {
    return this.origin.body();
  }
}

As you see, this "decorator" doesn't do anything except "just decorating." It encapsulates another Response and passes through all method calls.

If it's not clear yet, I'll explain the purpose of RsHtml. Let's say you have text and you want to create a Response:

String text = // you have it already
Response response = new RsWithType(
  new RsWithStatus(text, HttpURLConnection.HTTP_OK),
  "text/html"
);

Instead of doing this composition of decorators over and over again in many places, you use RsHtml:

String text = // you have it already
Response response = new RsHtml(text);

It is very convenient, but that RsWrap is very verbose. There are too many lines that don't do anything special; they just forward all method calls to the encapsulated Response.

How about we introduce a new concept, "decorators," with a new keyword, decorates:

class RsHtml decorates Response {
  RsHtml(final String text) {
    this(
      new RsWithType(
        new RsWithStatus(text, 200),
        "text/html"
      )
    );
  }
}

Then, in order to create an object, we just call:

Response response = new RsHtml(text);

We don't have any new methods in the decorators, just constructors. The only purpose for these guys is to create other objects and encapsulate them. They are not really full-purpose objects. They only help us create other objects.

That's why I would call them "decorating envelopes."
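
The same idea can be sketched today in plain Java, without the new keyword. All the names below (Text, Plain, Upper, TxtWrap, Shout) are hypothetical, invented only to show the shape of a decorating envelope:

```java
// the interface being decorated
interface Text {
  String read();
}

final class Plain implements Text {
  private final String str;
  Plain(final String str) {
    this.str = str;
  }
  @Override
  public String read() {
    return this.str;
  }
}

// a real decorator: adds behavior
final class Upper implements Text {
  private final Text origin;
  Upper(final Text txt) {
    this.origin = txt;
  }
  @Override
  public String read() {
    return this.origin.read().toUpperCase();
  }
}

// the verbose "envelope" base: just passes calls through
class TxtWrap implements Text {
  private final Text origin;
  TxtWrap(final Text txt) {
    this.origin = txt;
  }
  @Override
  public final String read() {
    return this.origin.read();
  }
}

// the decorating envelope: only a constructor, no new methods
final class Shout extends TxtWrap {
  Shout(final String str) {
    super(new Upper(new Plain(str)));
  }
}

public class Main {
  public static void main(final String[] args) {
    final String out = new Shout("hello").read();
    if (!out.equals("HELLO")) {
      throw new AssertionError(out);
    }
    System.out.println(out);
  }
}
```

Shout adds no behavior of its own; its only job is to compose the decorators once, so that callers don't repeat new Upper(new Plain(...)) everywhere---exactly what RsHtml does for Response.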

This idea may look very similar to the Factory design pattern, but it doesn't have static methods, which we are trying to avoid in object-oriented programming.

16 Don'ts of Career Growth

  • Odessa, Ukraine

I get questions like this all the time: How does one become a senior software developer or an architect? How does one grow from a junior just starting to write Java code to the leader of a software team that is driving a BMW and making $150K+ per year? What are the exact steps that won't waste time and will get you there faster? Let me share what I think might be helpful.

The Grand Budapest Hotel (2014) by Wes Anderson

Before writing this, I Googled a bit and found a lot of interesting suggestions, like to be helpful, make friends, be language agnostic, code a lot, try to prove your bosses wrong, avoid conflicts, exercise, etc. Some of them are good, while others are very wrong, but most of them are just too far away from the main point.

I want to share what I believe will look more or less like explicit instructions of what to do tomorrow to become a $100-per-hour software architect in a few years. Well, they worked and keep working for me.

Don't Be Loyal. The company you are working for at the moment is just a training ground, nothing else. Don't invest an extra minute of your time into it. Be selfish; think only about yourself and your personal skills, knowledge, and experience. They pay you to be dedicated and loyal? Well, that's their fault. Use them to learn new technologies, experiment with new ideas, train and educate yourself, get new certificates, meet new people, etc. They must work for you, not the other way around.

Don't Work. Make sure programming is your hobby, not your job. Everything else must be secondary, including your family, friends, and WoW. Software engineering is your family, your passion, your friend, and your life. Without that attitude, you will always be a slave to those who think like that. You must not work; you must have fun in front of the laptop. More fun than you're having anywhere else. Never do anything that is not fun. If you notice you're writing some code because you "have to" instead of because "you want to," stop immediately. Something is going wrong and you're shooting yourself in the foot; your career is in trouble.

Don't Make Friends. I'm talking about professional relationships in the office, within your projects, at the company you're working for. Remember that 99 percent of people will not become experts. They will remain who they are---regular programmers with no passion or ambition. What's really bad for you is that they will want you to stay with them. Nobody will enjoy seeing your growth, and your closest friends will become your enemies. Not explicitly, but subconsciously they will do everything they can to prevent you from getting better and leaving them. And you will have to leave them if you grow up. To avoid all that, stay professional and don't make friends at work.

Don't Be Helpful. There are more than 10 million programmers in the world. They all need help. Why do you need to help that dude sitting next to you in the office? You won't save the world by helping people around you---forget that religious nonsense. If you really want to do good for the software industry, focus on bigger things: make an open source product, write a book, or improve the documentation of the project you are working on. By helping people around you and solving their problems, you just cripple them, nothing else.

Don't Ask for Help. Expect the same attitude from programmers around you. Again, the same argument applies: There are more than 6 million accounts registered on the StackExchange platform; if you need help, ask them. Don't ask your friends or colleagues. Train yourself to get help from public sources or from your project documentation. By asking people around you, you're making your life easier in the short term only. In the long run, you will lack that important skill of knowing how to find information. You will become a hostage to those friends who help you. Also, don't learn from people around you; learn from books, StackOverflow, and open source software.

Don't Waste Time. This is probably the most important advice, which I have to give myself first of all---unfortunately, I waste a lot of time. Any growth is always about saying "No." You must be prepared to say it to your friends, your family, your habits, your wishes, your projects, colleagues, classes, methods, and lines of code. Stop the projects that are taking time and giving nothing back. Don't call back those whom you don't need. Yes, they need you, but you don't need them. This may sound harsh and selfish, but that's the only way to get where you want to be. Time is your main resource; be very greedy.

Don't Skimp on Growth. You must invest in yourself. First of all, you have to buy books. Don't steal them, even though you can. Buy them, spending your own money. You will take them way more seriously. You will respect yourself for owning the library. You will feel that software engineering is forever with you; it's not temporary, it's not just a job, it's your life. Two books per month is your absolute minimum. Second, pay for certificates for the same reasons. Third, purchase software; don't steal it. Finally, don't be cheap about your laptop. It is much more important than your car or a birthday gift for your spouse. Your laptop is your instrument; it must be good and expensive, made by Apple. You must go "all in" if you want to win.

Don't Work Full-Time. As much as possible, try to stay away from full-time, 9-to-5 jobs---they pause your professional growth. Permanent or long-term employment gives you a stable income, a comfortable office environment, a predictable set of technical problems to solve, and the ability to become an expert over a small territory. At the same time, it takes away fear. That's right, fear. You are not afraid anymore, and that's why you stop growing. To grow and grow fast, you must always be challenged by new tasks, new teams, new projects, and new job interviews. You must always prove that you are worth something. Ideally, you must work on two to three projects part-time and change them every 6 to 12 months.

Don't Be Cheap. Forget the stories that teach "money is not everything, and an interesting project is much more important"---they are for losers. Money is everything. An interesting project will be properly funded. If it's not funded, the market doesn't need it. What are you doing there then? The only answer is that you're not as good as others; that's how they managed to buy you. My advice is to never pay attention to those cheap stories; demand cash, up front, as much as possible.

Don't Be Skeptical About Certifications. Many programmers think certifications are not important now because they don't really validate anything and are issued simply for money by big companies. Don't think like that. Certifications help you formalize your knowledge, put borders around it, and remove gaps. And they demonstrate to most of your potential employers that you're truly serious about software engineering.

Don't Ignore Management. Being a good programmer is not the same as being a good architect or a team leader. To move higher in that hierarchy, you must understand project management. And it's not just being nice to people and wearing a suit. It's a science, with a lot of rules, principles, methods, and best practices. You must study them and become very good at them. Just as good as you are in Java or C++. Start with PMBOK and earn your PMP certification.

Don't Underestimate English. Most of my readers are not native English speakers, just like myself. I'm addressing this paragraph to you: You must improve your speaking and writing skills; it's very important. You will never become an expensive software architect if you can't speak and write well. And it can't be Russian in English words. It must be the way people talk in San Francisco, not in Moscow. The best advice for learning: Watch English movies with subtitles. You must speak like Matt Damon or Al Pacino, not like Mutko.

Don't Ignore Open Source. You must be active in the open source community. It's a must. You either have your own open source project or you actively contribute to an existing one. Either way, it's crucial. Working in a closed office environment is one thing, while writing code that is visible to the entire world is a totally different thing. Most programmers are simply afraid of that, and they make up many excuses for why they are not there. Don't be one of them. Yes, it's difficult, it's stressful, it will consume a lot of your private time, and nobody will pay you for it. Do it anyway---this is the fastest way to grow. Moreover, I would recommend you try to open as much source code as possible, even if you write it for private and commercial projects. Some companies won't be against that.

Don't Be Invisible. Make sure you have Facebook, Twitter, LinkedIn, and Instagram accounts, along with a blog. You must be present on the Internet. You're a serious software architect? I should be able to Google your name and find a lot of professional links, not just your Tinder profile. And they will Google your name; don't ever doubt that. My book 256 Bloghacks may help you understand how to do it right.

Don't Stay Home. Attend seminars, meetups, and software conferences. At least once a month, you must go somewhere where other programmers are hanging out. You don't need to be super active and make a lot of friends---just be there and watch. Eventually you will realize that it's time to become a speaker. Remember that it doesn't really matter how much your coworkers respect you. What matters is what the market thinks about you.

Don't Forget to Relax. Nobody likes those smelly dorks who only get one haircut per year. They will hire you and respect you as a coder, but they will never take you seriously as a candidate for a role with a lot of responsibility. You will always look like a mentally unstable person. Instead, you must look "like business," even though you are a geek. That's why it's very important to pay attention to how you spend your free time---how you relax. Playing GTA 'til 3 a.m. is not what successful and happy software architects do. Instead, here is your short list of activities: sports, tourism, and night clubs. Be a normal person---that's the point.

Did I miss anything important?

Synchronized Decorators to Replace Thread-Safe Classes

  • Odessa, Ukraine

You know what thread safety is, right? If not, there is a simple example below. All classes must be thread-safe, right? Not really. Do only some of them have to be thread-safe? Wrong again. I think none of them have to be thread-safe, while all of them have to provide synchronized decorators.

Aladdin (1992) by Ron Clements and John Musker

Let's start with an example (it's mutable, by the way):

interface Position {
  void increment();
}

class SimplePosition implements Position {
  private int number = 0;
  @Override
  public void increment() {
    int before = this.number;
    int after = before + 1;
    this.number = after;
  }
}

What do you think---is it thread-safe? This term refers to whether an object of this class will operate without mistakes when used by multiple threads at the same time. Let's say we have two threads working with the same object, position, and calling its method increment() at exactly the same moment in time.

We expect the number integer to be equal to 2 when both threads finish up, because each of them will increment it once, right? However, most likely this won't happen.

Let's see what will happen. In both threads, before will equal 0 when they start. Then after will be set to 1. Then, both threads will do this.number = 1 and we will end up with 1 in number instead of the expected 2. See the problem? Classes with such a flaw in their design are not thread-safe.

The simplest and most obvious solution is to make our method synchronized. That will guarantee that no matter how many threads call it at the same time, they will all go sequentially, not in parallel: one thread after another. Of course, it will take longer, but it will prevent that mistake from happening:

class SimplePosition implements Position {
  private int number = 0;
  @Override
  public synchronized void increment() {
    int before = this.number;
    int after = before + 1;
    this.number = after;
  }
}

A class that guarantees it won't break no matter how many threads are working with it is called thread-safe.

Now the question is: Do we have to make all classes thread-safe or only some of them? It would seem to be better to have all classes error-free, right? Why would anyone want an object that may break at some point? Well, not exactly. Remember, there is a performance concern involved; we don't often have multiple threads, and we always want our objects to run as fast as possible. A between-threads synchronization mechanism will definitely slow us down.

I think the right approach is to have two classes. The first one is not thread-safe, while the other one is a synchronized decorator, which would look like this:

class SyncPosition implements Position {
  private final Position origin;
  SyncPosition(Position pos) {
    this.origin = pos;
  }
  @Override
  public synchronized void increment() {
    this.origin.increment();
  }
}

Now, when we need our position object to be thread-safe, we decorate it with SyncPosition:

Position position = new SyncPosition(
  new SimplePosition()
);

When we need a plain simple position, without any thread safety, we do this:

Position position = new SimplePosition();
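
Here is a self-contained sketch of the whole arrangement (I added a number() reader method, which the post doesn't show, just to make the result observable):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface Position {
  void increment();
  int number();
}

final class SimplePosition implements Position {
  private int number = 0;
  @Override
  public void increment() {
    // deliberately non-atomic: read, add, write
    int before = this.number;
    this.number = before + 1;
  }
  @Override
  public int number() {
    return this.number;
  }
}

final class SyncPosition implements Position {
  private final Position origin;
  SyncPosition(final Position pos) {
    this.origin = pos;
  }
  @Override
  public synchronized void increment() {
    this.origin.increment();
  }
  @Override
  public synchronized int number() {
    return this.origin.number();
  }
}

public class Demo {
  public static void main(final String[] args) throws Exception {
    final Position position = new SyncPosition(new SimplePosition());
    final ExecutorService pool = Executors.newFixedThreadPool(2);
    for (int t = 0; t < 2; ++t) {
      pool.submit(
        () -> {
          for (int i = 0; i < 100000; ++i) {
            position.increment();
          }
        }
      );
    }
    pool.shutdown();
    pool.awaitTermination(1L, TimeUnit.MINUTES);
    // the decorator serializes the increments, so none are lost
    if (position.number() != 200000) {
      throw new AssertionError("lost updates: " + position.number());
    }
    System.out.println(position.number());
  }
}
```

Replace SyncPosition with the bare SimplePosition in this snippet and the final count will almost certainly fall short of 200000---that's exactly the race the decorator eliminates.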

Making class functionality both rich and thread-safe is, in my opinion, a violation of that famous single responsibility principle.

By the way, this problem is very close to the one of defensive programming and validators.

How to Teach a Customer

  • Dnipro, Ukraine

In outsourcing, very often a customer doesn't really know what he needs---not only in terms of functionality, but also on a technical level. What makes the situation even worse is that the customer always thinks he knows and understands enough. The question is how do you teach a customer? How do you train, educate, and help him? You don't!

The Firm (1993) by Sydney Pollack

The temptation will be huge, though. You will think that the customer is your friend. You will want to help him. You will feel very motivated to make the product better. Moreover, you have the needed knowledge, so why not share it, right?

Wrong. Very wrong. On so many levels.

Most of all because the customer is not your friend. Not a partner, not a co-worker, not a colleague, and not a teammate. The customer is a project stakeholder, just like you, but his needs are completely opposite of yours, and he is very aware of that.

He wants the project to finish as soon as possible and to take as little money out of his pocket as possible. You want exactly the opposite. You both work for the project, but from two very different angles.

Very soon, when the project fails (and it will fail, just like 94 percent of all software projects, according to the Chaos Report), the customer will find someone to blame. Needless to say, he won't blame himself; he will blame you.

According to the same report, only 7 percent of failures are caused by technical incompetence. Thus, most likely your project will fail because of an incorrect understanding of requirements, poor planning, misalignment of management objectives, etc. Do you really want to be blamed for the things you're not an expert at? Let the customer fail; it's his project, his life, and his money.

Do your technical job right and stay away from everything else. This is what true professionalism is about---focus on the things you can do best of all and better than most others.

However, if you see that he is doing something wrong and definitely requires help, recommend that he hire a consultant. There are many people in the market who can help him with requirements, UX, business analysis, marketing planning, branding, etc. In most cases, customers just don't know that these people exist, or they believe these services are just a waste of money.

Try to convince them that it's not true, but don't become an adviser in something you don't understand. Your job is coding, so focus only on that.

How Much Do You Love Conflict?

  • Kiev, Ukraine

Conflict is what progress is made of. A professional and well-managed team loves conflicts and creates them on a daily basis. A professional project manager provokes conflicts and makes sure none of them end in a consensus. Does that sound strange? It's not sarcasm. Read on.

Being Flynn (2012) by Paul Weitz

Have you ever heard the term "win-win?" Do you know what it means? My guess is that most of my readers aren't exactly sure what this is about, even though it's used very often. Let me explain. In any conflict, there are three possible outcomes: lose-lose, win-lose, and win-win. The first one is the worst, and the last one is the best. Here is an example.

Say your wife wants to watch a movie, and you want to watch a baseball game. That's a conflict. It starts with a confrontation of positions. Your position is, "I want this game," while her position is, "I want this movie."

The easiest way is to hold to these positions no matter what, but very soon your conflict will turn into a fight and maybe eventually a divorce.

Project management offers a few conflict resolution techniques that can help you and your wife get out of this confrontation without asking the police for help. No matter which technique you use, the result will be either lose-lose, win-lose, or win-win.

Lose-Lose

Compromise is the worst outcome, and it's known as lose-lose. For example, you both agree on watching the news---that's a compromise. Neither of you will get what you wanted, a movie or a baseball game. You both lose. Who gains in this case? Your neighbors and the police, since there will be no fight. Will the problem really be solved? No. You both will hate each other even more, because neither of your desires was satisfied. The divorce is still coming closer.

The same happens in software team conflicts---if and when we resolve them through compromises, everybody suffers except those management and HR monkeys who only care about a peaceful office environment. They don't want to see us fighting over a piece of damn Java code. Moreover, they don't really understand what the fight is about. They know nothing about that Singleton design pattern and can't understand why these guys are almost ready to kill each other just because one of them says it's a pattern and the other one calls it an anti-pattern, insists that the project must not use it, and threatens everybody with an immediate discharge if they don't listen.

Such a fight freaks everybody out. Everybody who sees positions and doesn't see interests, that is. Remember, the position is, "I want to see the movie" and "I want to use a singleton." The only thing a confrontation of positions can produce is a fight, and the only solution is a compromise: "You guys need a good team-building party so you become friends and lose the desire to fight." That's what those monkeys build: teams. They believe that when the team is "strong," there will be no fights, no conflicts, no arguments, no design patterns, no anti-patterns, and ... no senior developers. There will be just one permanent compromise over everything.

In a family, compromises lead to divorces. In a software team, the best talent just leaves. They simply don't want to see their interests being disrespected all the time, just for the sake of avoiding fights. Stay away from compromises; they are pure evil for both a family and a team.

Win-Lose

The second option, which is a bit better than a compromise, is to use force: "I'm a man, so you do what I say; we will watch the game!" or "I feel sick; let me watch a movie." In either case, one of you will get what he or she initially wanted. Even though this approach looks less "democratic," it's way more effective, mostly because it doesn't involve any third parties: There is no interest of the police or neighbors involved, and the family resolves the conflict internally and naturally.

Both of you understand exactly why you're watching that game now: because the male part of the family is physically stronger. Even though it may sound super annoying to you, my Californian readers, such a family would be way farther from a divorce than the one that used to make compromises, especially if the winning party is not always the same.

If your software team has an experienced architect, you will most likely work in this conflict resolution model. He or she will make decisions, and you will have to go along. I wrote about such an architect here and here. I said there that an architect must be a dictator, making decisions and taking full responsibility for them.

If the architect is super smart, respected by everybody, and immortal, this force-based conflict resolution technique will work perfectly. The project will move forward fast, because everybody will work instead of think. There will be only one person who thinks---the architect.

The main drawback of this win-lose approach is the "lose" part: Someone is always losing. And it's not about an offense, even though that's also important. It's about us missing some valuable information. You will never know why your wife wanted to watch that movie or why that junior developer was suggesting you use NoSQL instead of SQL. You will just force them both to shut up and follow your will. They will, but you will still "lose" something. So basically it's the team that is losing something, not just your wife or that junior developer.

Win-Win

The most difficult and yet most effective way to resolve a conflict is to collaborate in order to discover the interests of all parties and find a solution that satisfies them all. You start by asking, "Why do you want to watch that movie?" to learn what exactly is behind that aggressive "I want the movie" position. Again, there is a huge difference between a position and an interest.

You may hear this back: "I'm just tired." So the real interest is to relax, not to watch the movie. The movie was just one of the options to get rest. Now, knowing her real interest, you may come up with, "How about I watch the game and give you a massage at the same time?" This way, the divorce may never happen.

Thus, the first important step is to help everybody abandon their positions and honestly expose their interests. When that's done, we can all start to work not against each other but against the problem: With what solution will all our interests be satisfied at the same time?

We will ask that junior developer: "Why do you think we need NoSQL?" It's very likely that we will hear something like, "I just want to learn this new concept." This is his real interest---to learn something new while working on this project. Maybe we can offer him some other technology to learn? Maybe we can move him to another project where NoSQL is used? There are many options. But the first step is to understand what he really wants. Not what position he took, but what was his real motivation for it.

A truly professional software team is full of conflicts, which are always being resolved by collaboration. The team is not afraid of conflicts. Instead, it welcomes them, because they help reveal the real interests of all parties involved and make a lot of information visible and available.

Truly professional team players always try to provoke conflicts in order to gain an opportunity to resolve them through collaboration, thereby exiting through the win-win door. That's how the team grows---not by hiding conflicts and making compromises, but by provoking them, making different interests visible, and finding optimal solutions.

Be aware, though, that this is way more difficult than organizing team-building parties.

© Yegor Bugayenko 2014–2018

Can Objects Be Friends?

  • Moscow, Russia

As discussed before, proper encapsulation leads to a complete absence of "naked data." However, the question remains: How can objects interact if they can't exchange data? Eventually we have to expose some data in order to let other objects use it, right? Yes, that's true. However, I guess I have a solution that keeps encapsulation in place while allowing objects to interact.

Raging Bull (1980) by Martin Scorsese

Say that this is our object:

class TempCelsius implements Temperature {
  private int t;
  public String toString() {
    return String.format("%d C", this.t);
  }
}

It represents a temperature. The only behavior it exposes is printing the temperature in Celsius. We don't want to expose t, because that will lead to the "naked data" problem. We want to keep t secret, and that's a good desire.

Now, we want to have the ability to print temperature in Fahrenheit. The most obvious approach would be to introduce another method, toFahrenheitString(), or add a Boolean flag to the object, which will change the behavior of method toString(), right? Either one of these solutions is better than adding a method getT(), but neither one is perfect.

What if we create this decorator:

class TempFahrenheit implements Temperature {
  private TempCelsius origin;
  public String toString() {
    return String.format(
      "%d F", (int) (this.origin.t * 1.8 + 32)
    );
  }
}

It should work just great:

Temperature t = new TempFahrenheit(
  new TempCelsius(35)
);

The only problem is that it won't compile in Java, because class TempFahrenheit is not allowed to access private t in class TempCelsius. And if we make t public, everybody will be able to read it directly, and we'll have that "naked data" problem---a severe violation of encapsulation.

However, if we allow that access only to one class, everything will be fine. Something like this (won't work in Java; it's just a concept):

class TempCelsius {
  trust TempFahrenheit; // here!
  private int t;
  public String toString() {
    return String.format("%d C", this.t);
  }
}

Since this trust keyword is placed into the class that allows access, we won't have the "naked data" problem---we will always know exactly which objects possess knowledge about t. When we change something about t, we know exactly where to update the code.
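Java has no trust keyword, of course, but package-private visibility gives a rough approximation of the idea: if the class and its decorators live in the same package, the attribute stays hidden from everyone outside that package. Here is a minimal sketch under that assumption (the shared package and the constructors are mine, not from the discussion above):

```java
// Both classes are assumed to sit in one package, e.g. `temperature`.
// Package-private visibility of `t` is a rough stand-in for `trust`:
// only classes in this package---ideally just the decorators---can read it.
class TempCelsius {
  final int t; // package-private, not public: outsiders still can't touch it
  TempCelsius(int t) {
    this.t = t;
  }
  @Override
  public String toString() {
    return String.format("%d C", this.t);
  }
}

class TempFahrenheit {
  private final TempCelsius origin;
  TempFahrenheit(TempCelsius origin) {
    this.origin = origin;
  }
  @Override
  public String toString() {
    // the decorator reads `t` directly; no getter is exposed to the world
    return String.format("%d F", (int) (this.origin.t * 1.8 + 32));
  }
}
```

This is weaker than trust, because every class in the package gets the access, not only the decorators we intended; still, the "naked data" is at least confined to one package instead of the whole code base.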

What do you think?

P.S. After discussing this idea below in comments I started to think that we don't need that trust keyword at all. Instead, we should just give all decorators access to all private attributes of an object.


MVC vs. OOP

  • Kiev, Ukraine

Model-View-Controller (MVC) is an architectural pattern we are all well aware of. It's a de-facto standard for almost all UI and Web frameworks. It is convenient and easy to use. It is simple and effective. It is a great concept ... for a procedural programmer. If your software is object-oriented, you should dislike MVC as much as I do. Here is why.

Hot Shots! (1991) by Jim Abrahams

This is how MVC architecture looks:

PlantUML SVG diagram

Controller is in charge, taking care of the data received from Model and injecting it into View---and this is exactly the problem. The data escapes the Model and becomes "naked," which is a big problem, as we agreed earlier. OOP is all about encapsulation---data hiding.

MVC architecture does exactly the opposite by exposing the data and hiding behavior. The controller deals with the data directly, making decisions about its purpose and properties, while the objects, which are supposed to know everything about the data and hide it, remain anemic. That is exactly the principle any procedural architecture is built upon; the code is in charge of the data. Take this C++ code, for example:

void print_speed() { // controller
  int s = load_from_engine(); // model
  printf("The speed is %d mph", s); // view
}

The function print_speed() is the controller. It gets the data s from the model load_from_engine() and renders it via the view printf(). Only the controller knows that the data is in miles per hour. The engine returns an int without any properties; the controller simply assumes that the data is in mph. If we want to create a similar controller somewhere else, we will have to make the same assumption again and again. That's what the "naked data" problem is about, and it leads to serious maintainability issues.

This is an object-oriented alternative to the code above (pseudo-C++):

printf(
  new PrintedSpeed( // view
    new FormattedSpeed( // controller
      new SpeedFromEngine() // model
    )
  )
);

Here, SpeedFromEngine.speed() returns speed in mph, as an integer; FormattedSpeed.speed() returns "%d mph"; and finally, PrintedSpeed.to_str() returns the full text of the message. We can call them "model, view, and controller," but in reality they are just objects decorating each other. It's still the same entity---the speed. But it gets more complex and intelligent by being decorated.

We don't tear the concept of speed apart. The speed is the speed, no matter who works with it and where it is presented. It just gets new behavior from decorators. It grows, but never falls apart.
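The same decorating chain can be sketched in Java; the Speed interface and the method name asText() are my assumptions for this illustration, not part of the pseudo-C++ above:

```java
// One entity---the speed---growing through decoration.
interface Speed {
  String asText();
}

// "model": knows where the number comes from (stubbed here with a constant)
class SpeedFromEngine implements Speed {
  private final int mph;
  SpeedFromEngine(int mph) {
    this.mph = mph;
  }
  @Override
  public String asText() {
    return Integer.toString(this.mph);
  }
}

// "controller": adds the units
class FormattedSpeed implements Speed {
  private final Speed origin;
  FormattedSpeed(Speed origin) {
    this.origin = origin;
  }
  @Override
  public String asText() {
    return this.origin.asText() + " mph";
  }
}

// "view": builds the full printable message
class PrintedSpeed implements Speed {
  private final Speed origin;
  PrintedSpeed(Speed origin) {
    this.origin = origin;
  }
  @Override
  public String asText() {
    return "The speed is " + this.origin.asText();
  }
}
```

Calling new PrintedSpeed(new FormattedSpeed(new SpeedFromEngine(65))).asText() yields "The speed is 65 mph": each layer adds behavior to the same entity, and no layer reaches past its neighbor to grab raw data.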

To summarize, Controller is a pure procedural component in the MVC trio, which turns Model into a passive data holder and View into a passive data renderer. The controller, the holder, the renderer ... Is it really OOP?


How to Pay Programmers Less

  • Tallinn, Estonia

To create software, you need programmers. Unfortunately. They are expensive, lazy, and almost impossible to control. The software they create either works or doesn't, but you still have to pay them, every month. Of course, it's always better to pay less. However, sometimes they may figure out they are being underpaid and quit. How do you prevent that? Unfortunately, we can't use violence any more, but there are some other mechanisms. Let me share.

Ben-Hur (1959) by William Wyler

Keep salaries secret. It's obvious: Don't let them discuss salaries. They must keep this information secret. Warn them or even have them sign NDAs prohibiting any talk about wages, bonuses, compensation plans, etc. They must feel that this information is toxic and never talk to each other about salaries. If they don't know how much their coworkers are getting, they won't raise salary questions for a long time.

Give raises randomly. There should be no system behind your salary upgrades or firing decisions. You give them raises when you feel like it, not when they are being more productive or effective. Try to make your decisions unpredictable. Unpredictability creates fear, and this is exactly what you need. They will be afraid of you and will not complain about being underpaid for a long time.

No conferences. Don't allow them to attend meetups or conferences. They may meet recruiters there and find out that their salaries are not fair. Promote the idea that conferences are just a waste of time. It's better to organize events in the office. They always have to stay together, never free to meet programmers from other companies. The less they know, the safer you are.

No work from home. The office must be their second home. Well, preferably the first one. They must go there every day and have a desk there, a computer, a chair, and a stapler. They will be emotionally attached to the place, and it will be very difficult to leave, no matter how underpaid they are. Never allow them to work remotely---they may start thinking about a new home with a bigger salary.

Spy on them. Make sure they all use your email server, computers, servers, and even mobile phones. Install software that tracks all their messages. Ideally, you should have a security department watching all of them and regularly informing you about abnormal or suspicious behavior (office cameras will help too). Any contact with other companies should be considered suspicious. Employees must know you're spying on them. Extra fear is always helpful.

Make a deal with competitors. Contact your major competitors in the region and agree to not head-hunt their programmers if they don't touch yours. If they reject this deal, try to recruit a few of their key engineers. Just offer to double their salaries. You won't really hire them, of course, but this move will definitely shake your local market, and competitors will be afraid of you. They will agree to never touch your developers.

Promote corporate values. Brainwash them regularly by communicating how great your company is, how big its mission is, and how important their contribution is. The numbers on their paychecks will look way less important compared to the multi-billion-dollar market the team is trying to dominate. They will sacrifice for a while. For quite a long time, this trick will work.

Build a family. Corporate parties, Friday beer, team building events, bowling, birthdays, lunches and team nights---use these tools to create a feeling that your company is their family. Money is not really what good people talk about in a family, right? Asking for a raise will sound like a betrayal of family values---they will be afraid to do that.

Stress them. They must not feel relaxed; it's not in your favor. Make sure they have tight deadlines, complex problems to solve, and enough guilt on their shoulders. They won't ask for a raise while constantly feeling guilty for letting you down with project goals. Try to make them responsible for failures as much as possible.

Make promises. You don't need to keep them, but you have to make them: Promise to raise their salaries soon, or by the time you raise investment, or by the time a big contract is signed, or when "the time is right." It is important to always make your promises dependent on events that are out of your control---your hands must always be clean.

Buy them cushioned chairs and tennis tables. Spend just a little on all those funny office things, and they will pay you back big time, through the ability to underpay your programmers. A fancy and professional coffee machine will cost you $1,000 and make it possible to save $200 to $300 on each programmer monthly. Do the math. Make yourself a rule: Instead of giving someone a raise, it's always better to buy a new PlayStation for the office. Also, let them bring their pets to the office---they will stay longer for less money.

Give them sound titles. Call them Vice Presidents, for example VP of Engineering, VP of Technology, VP of Whatever. Not a big deal for you, but very important for them. The salary will be much less valuable than the title they can put on their LinkedIn profiles. If you're running out of Vice Presidents, try Senior Architect, Lead Technical Lead, Chief Scientist, etc.

Help them survive. Most programmers are rather stupid when it comes to managing money. They simply don't know how to buy insurance, how to plan a retirement fund, or even how to pay taxes. You help them, to your own benefit, of course. They will be happy to feel safe in your hands, and won't leave you. They won't ask for a raise, either, because they will feel bad about even starting such a negotiation. You must be the "parent," and they will be the "kids." It's a good old model. It works.

Be a friend. This is the last and most powerful technique. You have to be a friend to your programmers. It's very difficult to negotiate money with a friend---they won't be able to do it easily. They will keep working for you for less money just because you're good friends. How do you become friends? Well, meet their families, invite them over for dinner at your house, give them birthday gifts---all those tricks. They will save you a lot of money.

Did I forget anything?


If you like this article, you will definitely like these very relevant posts too:

How Do You Punish Your Employees?
A sarcastic overview of different types of abusive and manipulative behavior a bad manager may expose to office employees.

How to Be a Good Office Slave
Office slavery is what most companies practice and what most office workers suffer from, often unconsciously.

Team Morale: Myths and Reality
Team morale is a key performance driver in any group, especially a software development team; however, there are many myths about it.


EO

  • Tallinn, Estonia

It's time to do it! We've started work on a new programming language. Its name is EO (as in Elegant Objects or in Esperanto): eolang.org. It's open source and community-driven: the yegor256/eo GitHub repo. It's still in very early draft form, but the direction is more or less clear: It has to be truly object-oriented, with no compromises. You're welcome to join us.

Vicky Cristina Barcelona (2008) by Woody Allen

Why yet another language? Because there are no object-oriented languages on the market that are really object-oriented, to my knowledge. Here are the things I think do not belong in a pure object-oriented language:

  • static methods
  • classes (only types and objects)
  • implementation inheritance
  • mutability
  • NULL
  • reflection
  • constants
  • type casting
  • annotations
  • flow control (for, while, if, etc.)

And many other minor mistakes that Java and C++ are full of.

At the moment, we think that EO will compile into Java. Not into byte-code, but into .java files, later compilable to byte-code.

I really count on your contribution. Please submit your ideas as tickets and pull requests to the yegor256/eo GitHub repo.


Encapsulation Covers Up Naked Data

  • Moscow, Russia

Encapsulation is the core principle of object-oriented programming that makes objects solid, cohesive, trustworthy, etc. But what exactly is encapsulation? Does it only protect against access to private attributes from outside an object? I think it's much more. Encapsulation leads to the absence of naked data on all levels and in all forms.

Borat: Cultural Learnings of America for Make Benefit Glorious Nation of Kazakhstan (2006) by Larry Charles

This is what naked data is (C code):

int t;
t = 85;
printf("The temperature is %d F", t);

Here t is the data, which is publicly accessible by the code around it. Anyone can modify it or read it.

Why is that bad? For one reason: tight and hidden coupling.

The code around t inevitably makes a lot of assumptions about the data. For example, both lines after int t assume that the temperature is in Fahrenheit. At the moment of writing, this may be true, but the assumption couples the code with the data. If tomorrow we change t to Celsius, the code won't know about the change. That's why I call this coupling hidden.

If we change the type of t from int to, say, double, the printf line won't print anything after the decimal point. Again, the coupling is there, but it's hidden. Later on, we simply won't be able to find all the places in our code where we made these or other assumptions about t.

This will seriously affect maintainability.

And this is not a solution, as you can imagine (Java now):

class Temperature {
  private int t;
  public int getT() { return this.t; }
  public void setT(int t) { this.t = t; }
}

It looks like an object, but the data is still naked. Anyone can retrieve t from the object and decide whether it's Fahrenheit or Celsius, whether it has digits after the dot or not, etc. This is not encapsulation yet!

The only way to encapsulate t is to make sure nobody can touch it either directly or by retrieving it from an object. How do we do that? Just stop exposing data and start exposing functionality. Here is how, for example:

class Temperature {
  private int t;
  public String toString() {
    return String.format("%d F", this.t);
  }
}

We don't allow anyone to retrieve t anymore. All they can do is convert temperature to text. If and when we decide to change t to Celsius, we will do it just once and in one place: in the class Temperature.

If we need other functions in the future, like math operations or conversion to Celsius, we add more methods to class Temperature. But we never let anyone touch or know about t.
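For instance, a Celsius rendering can be just one more method sitting next to the data; the method name toCelsiusString() and the constructor are my inventions for this sketch:

```java
class Temperature {
  private final int t; // Fahrenheit, hidden from everybody
  Temperature(int t) {
    this.t = t;
  }
  @Override
  public String toString() {
    return String.format("%d F", this.t);
  }
  // New functionality lives next to the data; `t` still never escapes.
  // If the internal unit ever changes, only this class changes with it.
  public String toCelsiusString() {
    return String.format("%d C", (this.t - 32) * 5 / 9);
  }
}
```

A caller writes new Temperature(86).toCelsiusString() and gets "30 C" without ever learning how, or in what unit, the number is stored.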

This idea is close to "printers instead of getters," which we discussed earlier, though from a much wider perspective. Here I'm saying that any data elements that escape objects are naked and lead to maintainability problems.

The question is how we can work entirely without naked data, right? Eventually we have to let objects exchange data, don't we? Yes, that's true. But not entirely. I'll explain that in my next post.


Software Conferences to Attend

  • Vilnius, Lithuania

This is my list of software conferences that are worth attending, as a speaker or a listener, with a focus on Java and project management. I will try to update this list regularly, mostly so I don't forget where I have to submit my talks. Hopefully the list will help you too, to make the right choice and never miss their CFP deadlines.

Name      Place           When  CFP  Twitter
JEEConf   Kiev            May   Jan  @jeeconf
Øredev    Malmö           Nov   Mar  @oredev
JavaOne   San Francisco   Sep   Apr  @javaoneconf
Devoxx    Antwerp         Nov   Jun  @devoxx
JavaZone  Oslo            Sep   Apr  @javazone
QCon      San Francisco   Nov   May  @qcon
JFokus    Stockholm       Feb   Jun  @jfokus
GeekOUT   Tallinn         Jun   Sep  @geekoutee
JPoint    Moscow          Apr   Dec  @jugru

Did I forget anything?


Why I Don't Publish E-Books

  • Copenhagen, Denmark

Very often readers of my books ask me why I don't publish them in digital format as e-books for Amazon Kindle, EPUB, FB2, or simply PDF. There are a few reasons. It's time to summarize them all and explain why dead trees are the only way to go if you want to read my content.

The Addams Family (1991) by Barry Sonnenfeld

First of all, there is a simple technical reason:


I don't know how to format them. I type all my books in LaTeX. To my knowledge, it's the best and most powerful typesetting software. If you don't use it yet, you absolutely must read The TeXbook by Donald Knuth. Even if you're not going to become a book writer or publisher, you must read it. You will enjoy reading and will simply fall in love with TeX. The only problem with TeX is that it formats text for a fixed page size, unlike HTML and many other digital formats. When I write my books, I know exactly the size of their pages, and everything is formatted to look perfect on paper. I simply don't know how to do the same for all digital formats. I'm sure it's possible, but I don't know how.

Second, there is an emotional reason:

I don't like digital books. Call me old school, but I don't like to read on screen. I like how books feel, how they smell, and how they become "friends." I like to make notes, bookmarks, fold pages, etc. I believe what's very important is not just the content, but the way you "feel" it. With a digital book, this emotional aspect of reading is gone; all books are the same. You don't feel a book at all, because it's just a Kindle in your hands. You may say that not everybody is like me. Well, yes, but I want the world to be the way I like it. Not the way it is.

All other reasons are derived from the fact that a digital book will inevitably be stolen and posted on torrents or somewhere else, for free download. A digital book will become a free book very quickly.

Let me tell you a funny story. A few months ago I received an email from a "Korean book publisher." It said the company was very interested in translating Elegant Objects into Korean and publishing it in its local market. To start the process, the email said, the company needed the book in PDF. I replied that I was ready to send a printed copy, which is definitely enough for a translator to work with. The sender disappeared. I checked the company's website and found no real evidence of previously published books. I guess it was just a scam, an attempt to get a digital copy of the book. Funny, huh?

Thus, let's just agree that, in today's world, a digital book means a free book. And here is why I don't want my books to be free:

I want to earn. Not only because I need to pay my bills, but mostly because I want to stay motivated. I've made almost $12,000 by selling the first volume of Elegant Objects over the last 10 months. Do you think I'm motivated enough to write the second volume? Of course I am! Would I be as motivated as I am now if I had made $500 instead? I don't think so. Most probably, you would never see any more books from me. And it's not just about dollars. It's mostly about the appreciation I feel from you. Every payment I get from Amazon tells me that I definitely deliver something valuable. With a free book, I would get no appreciation and no cash.

I want you to pay. Not only because I'm greedy, but mostly because I want you to take my books seriously. As a reader myself, I pay almost no attention to books that cost $1.99 or nothing. I understand that their authors themselves were not serious about them. Why are they cheap or free? Were they so easy to write? Do their authors not believe that anyone would pay decent money for them? Are their authors afraid of refunds? Probably a combination of all that. I strongly believe that good products must cost good money. If it's free, it's bad (or there are some hidden costs or concealed promotion of something else).

Because of all that, you get no digital books. Only printed ones.


Software Quality Award, 2017

This is the third year of the Software Quality Award. The prize is still the same---$4,096. The rules are still the same. Read on. Previous years are here: 2015, 2016.

The rules:

  • One person can submit only one project.

  • Submissions are accepted until September 1, 2017.

  • I will check the commit history to make sure you're the main contributor to the project.

  • I reserve the right to reject any submission without explanation.

  • All submissions will be published on this page (including rejected ones).

  • Results will be announced October 15, 2017 on this page and by email.

  • The best project will receive $4,096.

  • Final decisions will be made by me and are not negotiable (although I may invite other people to help me make the right decision).

  • Winners that received any cash prizes in previous years can't submit again.

Each project must be:

  • Open source (in GitHub).

  • At least 10,000 lines of code.

  • At least one year old.

  • Object-oriented (that's the only thing I understand).

The best project is selected using these criteria.

What doesn't matter:

  • Popularity. Even if nobody is using your product, it is still eligible for this award. I don't care about popularity; quality is the key.

  • Programming language. I believe that any language, used correctly, can be applied to design a high-quality product.

  • Buzz and trends. Even if your project is yet another parser of command line arguments, it's still eligible for the award. I don't care about your marketing position; quality is all.

By the way, if you want to sponsor this award and increase the bonus, email me.


These 28 projects were submitted (in order of submission):


[15 Sep 2017] I invited six people to help me review the projects. Their names are:

[15 Oct 2017] This is the summary of everything they sent me: award-2017.txt. I will pick the winner in the next few days, stay tuned!

[21 Oct 2017] My short list includes these six projects (in random order): php-ai/php-ml, vavr-io/vavr, zetaops/ulakbus, mafagafogigante/dungeon, ribtoks/xpiks, javascript-obfuscator/javascript-obfuscator. Tomorrow (hopefully) I will decide how to split $4096.

[23 Oct 2017] These are my own observations per project from the short list. I will only mention negative things, since all projects are pretty good, no need to say how good they are. I listed problems in order of importance (most critical on top).

php-ai/php-ml (9.8K LoC PHP, 29K HoC):

  • How do you release it?
  • Getters, setters and mutability in many places
  • NULL is in many places (again, I know there is no method overloading in PHP)
  • -ER: Estimator, Classifier, Clusterer, Optimizer, etc.
  • code in constructors (yes, I understand that it's PHP)
  • empty lines in method bodies
  • Score: 5

vavr-io/vavr (70K LoC Java, 834K HoC):

  • How do you release it?
  • There are really big "classes"; they are huge in the io.vavr.collection package
  • Interface Seq has 120+ methods! What is going on?
  • Utility classes, static methods
  • Some .java files contain several Java classes. Why?
  • Could not build master: #2147
  • Score: 4

zetaops/ulakbus (25K LoC Python, 707K HoC):

  • How do you release it?
  • No CI, no test coverage, no static analysis automation?
  • See the comments from the reviewer
  • Score: 2

mafagafogigante/dungeon (14K LoC Java, 88K HoC):

  • Release automated but only for one person
  • Static methods, getters, setters, mutability
  • Commits don't link to issues and PRs
  • In-method-body comments appear in many places; it's a bad practice
  • Score: 5

ribtoks/xpiks (180K+ LoC C/C++, 739K HoC):

  • No coverage, no static analysis
  • Types are rather big, with many methods
  • Util classes, helpers
  • -ERs: CommandManager, SpellCheckWorker, etc.
  • I didn't really find much documentation inside the code
  • Commits are not linked to issues/PRs
  • Score: 4

javascript-obfuscator/javascript-obfuscator (72K LoC JS/TS, 400K HoC):

  • Utils and just global functions
  • Annotation-driven injectable dependencies
  • -ERs: reader, sanitizer, emitter
  • Public attributes in classes
  • I believe many "objects" are just DTOs here
  • Interfaces are prefixed with I; it's an anti-pattern
  • Score: 4

My overall impression this year is that I'm getting much less garbage. Fewer projects were submitted, but their quality is much higher than in the previous two years. I'm glad to see this tendency. It means to me that I'm doing the right thing.

This time I paid more attention to the elegance of OOP and maintainability of the code base. Key factors for the maintainability were:

  • Automated releases
  • Automated static analysis
  • Automated builds (CI)
  • Automated tests
  • Disciplined commits, via issues and PRs

For the elegance of OOP, as usual, I paid attention to the absence of anti-patterns, including NULL, getters, setters, static, mutability, etc.

There are two winners this year: php-ai/php-ml and mafagafogigante/dungeon. But I don't really like the code I found in these repositories. It's obviously better than everybody else's, but not perfect at all.

That's why here is my decision: I will give just $1,024 to each winner instead of $2,048.

Congratulations to @itcraftsmanpl for php-ml ($1,024) and to @mafagafogigante for dungeon ($1,024).

Here are your badges:

winner   winner

Put this code into GitHub README (replace ??? with your GitHub name in the URL):

<a href="http://www.yegor256.com/2016/10/23/award-2017.html">
  <img src="//www.yegor256.com/images/award/2017/winner-???.png"
  style="width:203px;height:45px;" alt='winner'/></a>

Thanks to everybody for your participation! See you next year.

© Yegor Bugayenko 2014–2018

Command, Control, and Innovate

  • Palo Alto, CA

Command and control has worked effectively in military units across the world for thousands of years. But apparently we've just discovered that the best companies are built on different verbs, which are inspire, delegate, trust, lead, innovate, etc. The question is whether we really uncovered something new that our predecessors failed to understand for ages or something else is going on.

Andrei Rublev (1966) by Andrei Tarkovsky

We are lazy and greedy animals. To work and produce something for someone, we need two things: motivation and punishment. The carrot and the stick have been the dominating principle of management for thousands of years. The Colosseum was built not because people enjoyed building it but rather thanks to a simple rule: Good slaves ate, and bad ones were beaten to death. A primitive form of command-and-control management was most effective at that time, in both civil and military arenas.

Once slavery became illegal in the 19th century, the simple rule changed: Good workers were paid, while bad ones were fired. 150 years ago in most countries, losing a job literally meant starvation and sometimes death, so it was not really far away from beating slaves to death. Because a hundred years ago there were nearly no mechanisms for social protection, capitalists were allowed to do almost anything they wanted. A slightly more advanced but still rather primitive form of command and control was the best management paradigm.

Besides that, armies throughout history have always been built as hierarchies with very strict and deterministic definitions of responsibilities and authorities. Since the time of Sun Tzu, the external strength of any army has been ensured by its internal discipline, which has always been about a clear and explicit chain of command, rewards, and punishments.

The situation started to change only recently, in the 20th century. Three trends dramatically influenced the balance of power between employers and employees, masters and slaves, managers and managees: socialism, computers, and education.

  • First of all, socialism is slowly taking over capitalism. Workers gradually obtain more rights and protections while employers lose them every year. Losing a job is not a tragedy for us anymore.

  • Second, the complexity of the tasks we perform at our workplaces is growing, mostly thanks to computers. We are not as easily replaceable as we were a few hundred years ago.

  • Third, we are getting smarter every year. Most of us know how to read and write. We learn more, faster, partially due to the Internet.

Thanks to these three major trends, it's almost impossible to apply the same primitive command-and-control management anymore: Modern workers are not the same as those who built the Colosseum in ancient Rome. We are very different, and our carrots and sticks must also be very different in order to be effective. Still, giving us carrots and sticks is absolutely necessary, because we are still lazy and greedy, just like the guys who built the Colosseum. Likewise, we need motivation and punishment in order to produce something for someone.

What about creativity and inspiration? Just like the architects of the Colosseum, we need people today to create iPads and Facebooks, but management and coordination are what really make projects happen. And command and control is the only working mechanism for coordinating humans.

However, what management is doing now is absolutely evil and unethical. They still adhere to command and control but mask it as inspire and trust. They use carrots and sticks but redefine them as appreciation and peer pressure. They lie to us that we are not animals anymore and don't need command and control, while at the same time doing exactly that.

The primary victim of this slick approach is our mental health. A thousand years ago, masters physically damaged their slaves; today they damage us mentally. Which one is worse? Where are we heading? I predict serious problems in the near future.


OOP Without Classes?

  • Palo Alto, CA

I interviewed David West, the author of the Object Thinking book, a few weeks ago, and he said that classes were not meant to be in object-oriented programming at all. He actually said that earlier; I just didn't understand him then. The more I've thought about this, the more it appears obvious that we indeed do not need classes.

Battleship Potemkin (1925) by Sergei M. Eisenstein

Here is a prototype.

Let's say we have only types and objects. First, we define a type:

type Book {
  void print();
}

Then we create an object (pay attention; we don't "instantiate"):

Book b1 = create Book("Object Thinking") {
  String title;
  Book(String t) {
    this.title = t;
  }
  public void print() {
    print("My title: " + this.title);
  }
}

Then we create another object, which will behave similarly to the one we already have but with different constructor arguments. We copy an existing one:

Book b2 = copy b1("Elegant Objects");

Libraries will deliver us objects, which we can copy.

That's it.

No implementation inheritance and no static methods, of course. Only subtyping.
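Today's Java cannot express this directly, since anonymous classes are still classes under the hood, but the idea can be approximated. In the sketch below, the create method is my own stand-in for the hypothetical create keyword, and copy becomes a method on the type:

```java
interface Book {
    void print();
    Book copy(String title); // "copy" an existing object with a new argument
}

class Demo {
    // A stand-in for the hypothetical "create" keyword:
    static Book create(String title) {
        return new Book() {
            @Override
            public void print() {
                System.out.println("My title: " + title);
            }
            @Override
            public Book copy(String t) {
                return create(t); // same behavior, different argument
            }
        };
    }

    public static void main(String[] args) {
        Book b1 = Demo.create("Object Thinking");
        Book b2 = b1.copy("Elegant Objects"); // copied, not instantiated
        b1.print(); // My title: Object Thinking
        b2.print(); // My title: Elegant Objects
    }
}
```

Only the type Book and its objects exist here; no named class for books appears anywhere in user code.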

Why not?


Inheritance Is a Procedural Technique for Code Reuse

  • Palo Alto, CA

We all know that inheritance is bad and that composition over inheritance is a good idea, but do we really understand why? In almost all articles I've found addressing this subject, authors have said that inheritance may be harmful to your code, so it's better not to use it. This "better" part is what bothers me; does it mean that sometimes inheritance makes sense? I interviewed David West (the author of Object Thinking, my favorite book about OOP) a few weeks ago, and he said that inheritance should not exist in object-oriented programming at all (full video). Maybe Dr. West is right and we should totally forget the extends keyword in Java, for example.

Death at a Funeral (2007) by Frank Oz

I think we should. And I think I know the reason why.

It's not because we introduce unnecessary coupling, as Allen Holub said in his Why extends is evil article. He was definitely right, but I believe it's not the root cause of the problem.

"Inherit," as an English verb, has a number of meanings. This one is what inheritance inventors in Simula had in mind, I guess: "Derive (a quality, characteristic, or predisposition) genetically from one's parents or ancestors."

Deriving a characteristic from another object is a great idea, and it's called subtyping. It perfectly fits into OOP and actually enables polymorphism: An object of class Article inherits all characteristics of objects in class Manuscript and adds its own. For example, it inherits an ability to print itself and adds an ability to submit itself to a conference:

interface Manuscript {
  void print(Console console);
}
interface Article extends Manuscript {
  void submit(Conference cnf);
}

This is subtyping, and it's a perfect technique; whenever a manuscript is required, we can provide an article and nobody will notice anything, because type Article is a subtype of type Manuscript (Liskov substitution principle).

But what does copying methods and attributes from a parent class to a child one have to do with "deriving characteristics?" Implementation inheritance is exactly that---copying---and it has nothing to do with the meaning of the word "inherit" I quoted above.

Implementation inheritance is much closer to a different meaning: "Receive (money, property, or a title) as an heir at the death of the previous holder." Who is dead, you ask? An object is dead if it allows other objects to inherit its encapsulated code and data. This is implementation inheritance:

class Manuscript {
  protected String body;
  void print(Console console) {
    console.println(this.body);
  }
}
class Article extends Manuscript {
  void submit(Conference cnf) {
    cnf.send(this.body);
  }
}

Class Article copies method print() and attribute body from class Manuscript, as if it's not a living organism, but rather a dead one from which we can inherit its parts, "money, properties, or a title."


Implementation inheritance was created as a mechanism for code reuse, and it doesn't fit into OOP at all. Yes, it may look convenient in the beginning, but it is absolutely wrong in terms of object thinking. Just like getters and setters, implementation inheritance turns objects into containers with data and procedures. Of course, it's convenient to copy some of those data and procedures to a new object in order to avoid code duplication. But this is not what objects are about. They are not dead; they are alive!

Don't kill them with inheritance :)

Thus, I think inheritance is bad because it is a procedural technique for code reuse. It comes as no surprise that it introduces all the problems people have been talking about for years. Because it is procedural! That's why it doesn't fit into object-oriented programming.
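The same reuse the extends version achieves can be had through composition, keeping subtyping intact. Here is a sketch, where Console and Conference are hypothetical one-method interfaces standing in for the originals:

```java
// Hypothetical one-method interfaces, standing in for the originals:
interface Console { void println(String text); }
interface Conference { void send(String text); }

interface Manuscript {
    void print(Console console);
}

class Plain implements Manuscript {
    private final String body;
    Plain(String body) {
        this.body = body;
    }
    @Override
    public void print(Console console) {
        console.println(this.body);
    }
}

// Article reuses Manuscript by wrapping it, not by extending it:
class Article implements Manuscript {
    private final Manuscript origin;
    Article(Manuscript origin) {
        this.origin = origin;
    }
    @Override
    public void print(Console console) {
        this.origin.print(console);
    }
    void submit(Conference cnf) {
        // Reuse print() to deliver the body to the conference:
        this.origin.print(text -> cnf.send(text));
    }
}
```

Whenever a Manuscript is required, an Article still fits (subtyping survives), but nothing is copied from a dead parent; the wrapped object stays alive and keeps its body to itself.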

By the way, we discussed this problem in our Gitter chat (it's dead already) a week ago, and that's when it became obvious to me what exactly is wrong with inheritance. Take a look at our discussion there.


Gradients of Immutability

  • Palo Alto, CA

Good objects are immutable, but not necessarily constants. I tried to explain it here, here, and here, but now it's time to make another attempt. Actually, the more I think about it, the more I realize that immutability is not black or white---there are a few more gradients; let's take a look.

Twelve Monkeys (1995) by Terry Gilliam

As we agreed here, an object is a representative of someone else (some entity or entities, other object(s), data, memory, files, etc.). Let's examine a number of objects that look exactly the same to us but represent different things, then analyze how immutable they are and why.

Constant

This is constant; it doesn't allow any modifications to the encapsulated entity and always returns the same text (I've skipped constructors for the sake of brevity):

class Book {
  private final String ttl;
  Book rename(String title) {
    return new Book(title);
  }
  String title() {
    return this.ttl;
  }
}

This is what we usually have in mind when talking about immutable objects. Such a class is very close to a pure function, which means that no matter how many times we instantiate it with the same initial values, the result of title() will be the same.
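To make the pure-function quality tangible, here is my own illustration, with the skipped constructor restored; rename() leaves the original object untouched and hands back a new one:

```java
final class Book {
    private final String ttl;
    Book(String ttl) {
        this.ttl = ttl;
    }
    Book rename(String title) {
        return new Book(title); // a new object; the old one is untouched
    }
    String title() {
        return this.ttl;
    }
}

class Demo {
    public static void main(String[] args) {
        Book first = new Book("Object Thinking");
        Book second = first.rename("Elegant Objects");
        System.out.println(first.title());  // Object Thinking
        System.out.println(second.title()); // Elegant Objects
    }
}
```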

Not a Constant

Check out this one:

class Book {
  private final String ttl;
  Book rename(String title) {
    return new Book(title);
  }
  String title() {
    return String.format(
      "%s (as of %tR)", this.ttl, new Date()
    );
  }
}

The object is still immutable, but it is not a pure function anymore because of the method title()---it returns different values if we call it multiple times with at least a one-minute interval. The object is immutable; it's just not a constant anymore.

Represented Mutability

How about this one:

class Book {
  private final Path path;
  Book rename(String title) {
    Files.write(
      this.path,
      title.getBytes(),
      StandardOpenOption.CREATE
    );
    return this;
  }
  String title() {
    return new String(
      Files.readAllBytes(this.path)
    );
  }
}

This immutable object keeps the book title in a file. It's not a constant, because its method title() may return a different value on each subsequent call. Moreover, the represented entity (the file) is not a constant. We can't say whether it's mutable or immutable, as we don't know how Files.write() is implemented. But we know for sure that it's not a constant, because it accepts change requests.

Encapsulated Mutability

An immutable object may not only represent but even encapsulate a mutable one. In the previous example, a mutable file was encapsulated: even though it was represented by the immutable class Path, the real file on disk was mutable. We can do the same, but in memory:

class Book {
  private final StringBuffer buffer;
  Book rename(String title) {
    this.buffer.setLength(0);
    this.buffer.append(title);
    return this;
  }
  String title() {
    return this.buffer.toString();
  }
}

The object is still immutable. Is it thread-safe? No. Is it a constant? No. Is it immutable? Yes. Confused? You bet.
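Here is my own sketch, with a constructor added, that makes the paradox concrete: the object never changes identity, yet its answers change.

```java
final class Book {
    private final StringBuffer buffer; // the reference is final...
    Book(StringBuffer buffer) {
        this.buffer = buffer;
    }
    Book rename(String title) {
        this.buffer.setLength(0);      // ...but the buffer's content is not
        this.buffer.append(title);
        return this;                   // the very same object comes back
    }
    String title() {
        return this.buffer.toString();
    }
}

class Demo {
    public static void main(String[] args) {
        Book book = new Book(new StringBuffer("Object Thinking"));
        Book same = book.rename("Elegant Objects");
        System.out.println(book == same); // true: no new object was made
        System.out.println(book.title()); // Elegant Objects
    }
}
```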


My point is that immutability is not binary; there are many forms of it. The simplest one is, of course, a constant. Constants are almost the same as pure functions in functional programming. But object-oriented programming allows us to take a few steps forward and give immutable objects more permissions and flexibility. In OOP, we may have many more forms of immutability.

What is common among all these examples is that our objects are loyal to the entities they encapsulate. There are no setters that could change them. All encapsulated objects are final.

This is the only quality that differentiates mutable objects from immutable ones. The latter are always loyal to the entities they encapsulate and represent. For all the rest ... it depends.


If you like this article, you will definitely like these very relevant posts too:

Objects Should Be Immutable
The article gives arguments about why classes/objects in object-oriented programming have to be immutable, i.e. never modify their encapsulated state

How an Immutable Object Can Have State and Behavior?
Object state and behavior are two very different things, and confusing the two often leads to incorrect design.

Immutable Objects Are Not Dumb
Immutable objects are not the same as passive data structures without setters, despite a very common misbelief.


Vertical vs. Horizontal Decomposition of Responsibility

  • Palo Alto, CA

Objects responsible for too many things are a problem. Because their complexity is high, they are difficult to maintain and extend. Decomposition of responsibility is what we do in order to break these overly complex objects into smaller ones. I see two types of this refactoring operation: vertical and horizontal. And I believe the former is better than the latter.

Once Upon a Time in America (1984) by Sergio Leone

Let's say this is our code (it is Ruby):

class Log
  def initialize(path)
    @file = IO.new(path, 'a')
  end
  def put(text)
    line = Time.now.strftime("%d/%m/%Y %H:%M ") + text
    @file.puts line
  end
end

Obviously, objects of this class are doing too much. They save log lines to the file and also format them---an obvious violation of the famous single responsibility principle. An object of this class would be responsible for too many things. We have to extract some functionality out of it and put it into another object (or objects). We have to decompose its responsibility. No matter where we put it, this is how the Log class will look after the extraction:

class Log
  def initialize(path)
    @file = IO.new(path, 'a')
  end
  def put(line)
    @file.puts line
  end
end

Now it only saves lines to the file, which is perfect. The class is cohesive and small. Let's make an instance of it:

log = Log.new('/tmp/log.txt')

Next, where do we put the lines with formatting functionality that were just extracted? There are two approaches to decompose responsibility: horizontal and vertical. This one is horizontal:

class Line
  def initialize(text)
    @line = text
  end
  def to_s
    Time.now.strftime("%d/%m/%Y %H:%M ") + @line
  end
end

In order to use Log and Line together, we have to do this:

log.put(Line.new("Hello, world"))

See why it's horizontal? Because this script sees them both. They both are on the same level of visibility. We will always have to communicate with both of them when we want to log a line. Both objects of Log and Line are in front of us. We have to deal with two classes in order to log a line:

PlantUML SVG diagram

To the contrary, this decomposition of responsibility is vertical:

class TimedLog
  def initialize(log)
    @origin = log
  end
  def put(text)
    @origin.put(Time.now.strftime("%d/%m/%Y %H:%M ") + text)
  end
end

Class TimedLog is a decorator, and this is how we use them together:

log = TimedLog.new(log)

Now, we just put a line in the log:

log.put("Hello, world")

The responsibility is decomposed vertically. We still have one entry point into the log object, but the object "consists" of two objects, one wrapped into another:

PlantUML SVG diagram

In general, I think horizontal decomposition of responsibility is a bad idea, while vertical is a much better one. That's because a vertically decomposed object decreases complexity, while a horizontally decomposed one actually makes things more complex because its clients have to deal with more dependencies and more points of contact.


Eight Levels of Communication Maturity

  • Tallinn, Estonia

Each software team organizes its communications in its own specific way. Some use Slack, Trello, or GitHub; others just sit together in the same room. There are many methods and tools. I believe it's possible to rank them by the amount of damage they cause to your project. This is the list of all of them I'm aware of at the moment.

Schizopolis (1996) by Steven Soderbergh

The damage I'm talking about is caused mostly by the distance between these communication channels and project artifacts. The farther away people stay from documents, the bigger the risk of losing information. And lost information is the first source of trouble in any project.

Here is the list; it starts with the most damaging communication means and goes down to the most mature and professional ones, which cause the least amount of trouble:

  • Coffee Breaks. This is the most dangerous thing---you can never keep track of them, you won't know what they were about, and there is no "search" button anywhere. Everything you say standing next to that coffee machine will be lost. Nothing will be converted to project artifacts.

  • Phone Calls. A bit better than coffee breaks but still a big issue. Phone calls are completely untraceable. Information you exchange on those calls is gone forever. Well, you can record them, but searching through phone call records is a tough task that nobody will do, ever.

  • Meetings. This is the next step after coffee breaks, because there is some structure and minutes. Meetings can be recorded (both on and offline), with their results filed somewhere and decisions documented. In reality, none of that will actually happen. Meetings will just kill your time and your sponsor's money.

  • Emails. If you can put some formality into emails and discipline all participants, your email history may be considered a project artifact in itself. How organized and easily browseable will that artifact be? That's a good question. In most cases, it will just be a mess.

  • Mailing Lists. They are better than emails, because some software is archiving them and making them available and browseable. But it will be difficult to find where exactly what topic was discussed, where decisions were made and why, who suggested what, etc.

  • Slack. There are many similar alternatives that are basically online chats. The main problem with all of them is that it's difficult to categorize such a chat, group messages together, or find something later. It's merely a flow of information that becomes useless just a few days later. Of course, if you really want to find something there, it's possible. But the quality of such a "document" is very low.

  • Trello. By Trello I mean any task/ticket tracking system---they are great instruments to immediately turn conversations and discussions into project artifacts. You don't need to document anything; it's already there. The problem is that they are still rather far away from the main project artifact: the source code with its commits, merge conflicts, build logs, etc.

  • GitHub. This is the best instrument you can use. It integrates communications with the product itself. The code you write and the discussions you have around it are literally in the same place.

Which one of these is your project using right now? I would strongly recommend you stay away from communication channels at the top of this list.


What's Wrong With Object-Oriented Programming?

  • Palo Alto, CA

Recently, I was trying to convince a few of my readers that a better understanding of an object in OOP would help us solve many problems in existing pseudo-object-oriented languages. Then, suddenly, the question came up: "What problems?" I was puzzled. I thought it was obvious that the vast majority of modern software written in modern OO languages is unmaintainable and simply a mess. So I Googled a bit, and this is what I found (in chronological order).

Swingers (1996) by Doug Liman

The list of quotes is sorted in chronological order, with the oldest on the top:

Edsger W. Dijkstra
Edsger W. Dijkstra (1989)
"TUG LINES," Issue 32, August 1989
"Object oriented programs are offered as alternatives to correct ones" and "Object-oriented programming is an exceptionally bad idea which could only have originated in California."

Alan Kay
Alan Kay (1997)
The Computer Revolution hasn't happened yet
"I invented the term object-oriented, and I can tell you I did not have C++ in mind." and "Java and C++ make you think that the new ideas are like the old ones. Java is the most distressing thing to happen to computing since MS-DOS." (proof)

Paul Graham
Paul Graham (2003)
The Hundred-Year Language
"Object-oriented programming offers a sustainable way to write spaghetti code."

Richard Mansfield
Richard Mansfield (2005)
Has OOP Failed?
"With OOP-inflected programming languages, computer software becomes more verbose, less readable, less descriptive, and harder to modify and maintain."

Eric Raymond
Eric Raymond (2005)
The Art of UNIX Programming
"The OO design concept initially proved valuable in the design of graphics systems, graphical user interfaces, and certain kinds of simulation. To the surprise and gradual disillusionment of many, it has proven difficult to demonstrate significant benefits of OO outside those areas."

Jeff Atwood
Jeff Atwood (2007)
Your Code: OOP or POO?
"OO seems to bring at least as many problems to the table as it solves."

Linus Torvalds
Linus Torvalds (2007)
this email
"C++ is a horrible language. ... C++ leads to really, really bad design choices. ... In other words, the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C. And limiting your project to C means that people don't screw that up, and also means that you get a lot of programmers that do actually understand low-level issues and don't screw things up with any idiotic "object model" crap."

Oscar Nierstrasz
Oscar Nierstrasz (2010)
Ten Things I Hate About Object-Oriented Programming
"OOP is about taming complexity through modeling, but we have not mastered this yet, possibly because we have difficulty distinguishing real and accidental complexity."

Rich Hickey
Rich Hickey (2010)
SE Radio, Episode 158
"I think that large objected-oriented programs struggle with increasing complexity as you build this large object graph of mutable objects. You know, trying to understand and keep in your mind what will happen when you call a method and what will the side effects be."

Eric Allman
Eric Allman (2011)
Programming Isn't Fun Any More
"I used to be enamored of object-oriented programming. I'm now finding myself leaning toward believing that it is a plot designed to destroy joy. The methodology looks clean and elegant at first, but when you actually get into real programs they rapidly turn into horrid messes."

Joe Armstrong
Joe Armstrong (2011)
Why OO Sucks
"Objects bind functions and data structures together in indivisible units. I think this is a fundamental error since functions and data structures belong in totally different worlds."

Rob Pike
Rob Pike (2012)
here
"Object-oriented programming, whose essence is nothing more than programming using data with associated behaviors, is a powerful idea. It truly is. But it's not always the best idea. ... Sometimes data is just data and functions are just functions."

John Barker
John Barker (2013)
All evidence points to OOP being bullshit
"What OOP introduces are abstractions that attempt to improve code sharing and security. In many ways, it is still essentially procedural code."

Lawrence Krubner
Lawrence Krubner (2014)
Object Oriented Programming is an expensive disaster which must end
"We now know that OOP is an experiment that failed. It is time to move on. It is time that we, as a community, admit that this idea has failed us, and we must give up on it."

Asaf Shelly
Asaf Shelly (2015)
Flaws of Object Oriented Modeling
"Reading an object oriented code you can't see the big picture and it is often impossible to review all the small functions that call the one function that you modified."


If you have something to add to this list, please post a comment below.


If-Then-Else Is a Code Smell

  • Tallinn, Estonia

In most cases (maybe even in all of them), if-then-else can and must be replaced by a decorator or simply another object. I've been planning to write about this for almost a year but only today found a real case in my own code that perfectly illustrates the problem. So it's time to demonstrate it and explain.

Fargo (1996) by the Coen Brothers

Take a look at the class DyTalk from yegor256/rultor and its method modify(). In a nutshell, it prevents you from saving any data to the DynamoDB if there were no modifications of the XML document. It's a valid case, and it has to be validated, but the way it's implemented is simply wrong. This is how it works (an oversimplified example):

class DyTalk implements Talk {
  void modify(Collection<Directive> dirs) {
    if (!dirs.isEmpty()) {
      // Apply the modification
      // and save the new XML document
      // to the DynamoDB table.
    }
  }
}

What's wrong, you wonder? This if-then-else forking functionality doesn't really belong to this object---that's what's wrong. Modifying the XML document and saving it to the database is its functionality, while not saving anything if the modification instructions set is empty is not (it's very similar to defensive programming). Instead, there should be a decorator, which would look like this:

class QuickTalk implements Talk {
  private final Talk origin;
  QuickTalk(Talk origin) {
    this.origin = origin;
  }
  void modify(Collection<Directive> dirs) {
    if (!dirs.isEmpty()) {
      this.origin.modify(dirs);
    }
  }
}

Now, if and when we need our talk to be more clever in situations where the list of directives is empty, we decorate it with QuickTalk. The benefits are obvious: the DyTalk class is smaller and therefore more cohesive.
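Here is a self-contained sketch of the arrangement, where Directive and Talk are reduced to bare interfaces and the DynamoDB persistence is replaced by a hypothetical counter, just to show how the decorator filters empty modifications:

```java
import java.util.Collection;
import java.util.List;

interface Directive { }

interface Talk {
    void modify(Collection<Directive> dirs);
}

// A stand-in for DyTalk that just counts "saves" (hypothetical):
class CountingTalk implements Talk {
    int saves;
    @Override
    public void modify(Collection<Directive> dirs) {
        this.saves++; // pretend we applied the directives and saved the XML
    }
}

// The decorator: empty modifications never reach the origin.
class QuickTalk implements Talk {
    private final Talk origin;
    QuickTalk(Talk origin) {
        this.origin = origin;
    }
    @Override
    public void modify(Collection<Directive> dirs) {
        if (!dirs.isEmpty()) {
            this.origin.modify(dirs);
        }
    }
}

class Demo {
    public static void main(String[] args) {
        CountingTalk dy = new CountingTalk();
        Talk talk = new QuickTalk(dy);
        talk.modify(List.of());                    // filtered out by the decorator
        talk.modify(List.of(new Directive() { })); // reaches the origin
        System.out.println(dy.saves);              // 1
    }
}
```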

But the question is bigger than just that. Can we make a rule out of it? Can we say that each and every forking is bad and should be moved out of a class? What about forking that happens inside a method and can't be converted to a decorator?

I'm suggesting this simple rule: If it's possible to convert if-then-else forking to a decorator, it has to be done. If it's not done, it's a code smell. Make sense?


A Distributed Team Delivers Code of Higher Quality

  • Las Vegas, NV

OK, the title is not exactly accurate: I omitted the word "can." A distributed team can deliver code of much higher quality than a co-located one, and now I'll explain why. Of course, not every distributed team can do that. Most of them can't even deliver code that works, let alone quality code. But if a team---a distributed one---is managed according to the principles I'll explain now, the quality will be much higher than the same team can achieve if co-located. What I'm going to show you is that working in a remote mode, if done right, guarantees higher quality of code. Surprised?

Ocean's Twelve (2004) by Steven Soderbergh

There are basically four simple ingredients to success ... you know what, there is actually one main ingredient, and its name is control. If we want quality to be at some level, we have to enforce it. We can't just declare it; we need to make it high.

How do software teams make high-quality code? Oh, there are many proven methods. First, you need a very modern office where developers sit on cushioned chairs, play table tennis, drink smoothies, and draw diagrams on walls. Second, you should buy them many books. Books have to be everywhere in the office, and they have to be about everything from Python and Haskell to Docker, Agile, and lean startups. The more books, the higher the quality of the code they write. And third, you have to pay them well. The more expensive the developer is, the higher the quality of the code he or she writes.

I'm sure you understand that I'm joking. None of these "proven methods" will either guarantee quality or motivate serious software engineers. Quality can be achieved only if it is controlled and enforced. And this is also what motivates programmers best of all---the fact that the quality is so important for management that they find mechanisms of control and enforcement, and they invest in them. Table tennis and lean startup books are not even close to those mechanisms.

So, now let's discuss those four ingredients of quality enforcement, which we practice in our projects:

  • Read-Only Master Branch. Nobody can make changes directly to the master branch, not even the architect or the project sponsor. The master branch is technically read-only. This means that in order to compromise the quality of our code, everyone has to go through a pull request, pre-flight build, and automated merge procedure.

  • Chats Are Prohibited. Any modification to our code base, even a very small one, must be submitted in a pull request. A code review must also occur in the pull request. We strictly disallow any informal communications between programmers, including chats, phone calls, emails, or face-to-face discussions. This means that the chances of quality compromises due to friendship, informal agreements, and trade-offs are very low.

  • Build Is Fragile. We believe that the higher the quality bar, the more difficult it is to modify any piece of code without breaking the build. We put a lot of quality checks right into the build to make the lives of programmers more difficult. Well, this is not our goal, but it happens. The code has to pass all static analysis checks, a test coverage threshold, mutation coverage threshold, and many others. This means that bad code won't reach the repo, ever.

  • Micro Payments for Deliverables. We pay only for closed tickets, and they are each very small (up to two hours). We don't pay for time spent in the office or in front of the computer. We pay only when tickets are closed---no close, no pay. This means that programmers are motivated to close them, nothing else.

Thus, as you can see, there is an intentionally created conflict. On one hand, programmers have to close tickets and deliver working code. On the other hand, it's rather difficult to do: the quality bar is high, there is no room to make compromises, and there is no technical possibility to work around an issue. Good programmers survive in this conflict and manage to deliver and get paid. Well paid.

And now, to the main point of this blog post---do you think it's possible to build all that in a co-located team? I don't think so. First of all, you won't be able to prohibit informal communications. No matter how many times you ask developers to communicate in tickets, they will resolve most of their technical questions face-to-face. It's inevitable.

Second, you won't be able to pay for results only, because programmers will complain that they are doing a lot of communication in the office that has to be paid somehow. In reality, they will spend two to three hours per day on actually writing code, and the rest of the time will be spent on coffee breaks, Trump talks, and Facebook scrolling. Again, it's inevitable.

And third, people are people. Nobody likes to hit that quality bar multiple times a day. They will complain, and eventually you will give them direct access to the master branch. First, you will give it to the architect, then to a few senior developers, then to a few good friends who you absolutely trust. Then to everybody, just in case. It's inevitable.

To summarize, I believe that co-located teams are just not made for quality programming. For fun---yes. For creativity---maybe. For burning investors' money---absolutely. For quality---not really.

© Yegor Bugayenko 2014–2018

8+2 Maturity Levels of Continuous Integration


  • Palo Alto, CA

There are a number of levels you have to go through before your continuous integration pipeline becomes perfect. I found eight of them and presented my findings at DevOpsDays in Salt Lake City a few weeks ago (watch the video). Now it's time to write them down and ask you---Which level are you at? Post your answer below.

Twins (1988) by Ivan Reitman
  1. Source Code. Here you just write source code on your computers and maybe somewhere on the server. The best you can do here is to build it manually, say, every day. Is it continuous integration? Well, to some extent, provided you don't forget to compile and package everything regularly.

  2. Automated Build. At this level, your build is automated, which means you can compile, test, and package the entire product with just one line at the command line. Pay attention; one line. You must be able to hit one button and either get an error or a successful build.

  3. Git. At this level, you keep your source code in Git. You can keep it in some other source control system, but that would be strange---Git is the status quo at the moment. You should be able to get a new computer, with nothing in it, check out the source code from a Git repository, and run a full build.

  4. Pull Requests. Each and every change to your source code must be submitted through a pull request, which means that you host your repository on GitHub. You may host it somewhere else, but again, that would be strange because GitHub is the status quo at the moment. Again, nobody should be able to commit anything directly to the master branch except through forks and pull requests.

  5. Code Reviews. Every pull request must pass a mandatory code review before it gets merged. You must have some code review policy that explains who does reviews, what happens if the author doesn't agree with the reviewer, etc. But no pull request may be merged unless it has been reviewed.

  6. Tests. At this level, your code is covered by unit tests (and integration tests), and every change comes with a new test. Your automated build runs all tests together and fails if any of them fail.

  7. Static Analysis. Checking the quality of your code without running it is what static analysis is about. At this level, the quality of your code is checked by the automated build. If the quality is lower than the threshold, the build fails.

  8. Pre-Flight Builds. This idea is explained here.

  9. Production Simulation. The build is run in a container, which simulates the production environment and data.

  10. Stress Tests. Performance and stress tests are automated and executed on every build.
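Level 6, reduced to its essence, is production code plus a test whose failure fails the whole build. Here is a minimal self-contained sketch in plain Java; in a real project a test framework such as JUnit, run by the build tool, would play this role, and the add() method is just an illustrative stand-in:

```java
public class Main {
    // The unit of production code under test (illustrative).
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // A plain check shows the principle: a failing assertion
        // exits with a non-zero status, which breaks the build.
        if (add(2, 2) != 4) {
            System.out.println("FAILED");
            System.exit(1);
        }
        System.out.println("OK");
    }
}
```

The key point is the non-zero exit code: that is the only signal a build pipeline needs in order to stop and refuse the change.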

By the way, at the presentation, I also mentioned what problems you may encounter at each maturity level.


ActiveRecord Is Even Worse Than ORM


  • Palo Alto, CA

You probably remember what I think about ORM, a very popular design pattern. In a nutshell, it encourages us to turn objects into DTOs, which are anemic, passive, and not objects at all. The consequences are usually dramatic---the entire programming paradigm shifts from object-oriented to procedural. I tried to explain this at JPoint and JEEConf this year. After each talk, a few people told me that what I'm suggesting is called the ActiveRecord or Repository pattern.

En duva satt på en gren och funderade på tillvaron (2014) by Roy Andersson

Moreover, they claimed that ActiveRecord actually solves the problem I've found in ORM. They said I should explain in my talks that what I'm offering (SQL-speaking objects) already exists and has a name: ActiveRecord.

I disagree. Moreover, I think that ActiveRecord is even worse than ORM.

ORM consists of two parts: the session and DTOs, also known as "entities." The entities have no functionality; they are just primitive containers for the data transferred from and to the session. And that is what the problem is---objects don't encapsulate but rather expose data. To understand why this is wrong and why it's against the object paradigm, you can read here, here, here, here, and here. Now, let's just agree that it's very wrong and move on.

What solution is ActiveRecord proposing? How is it solving the problem? It moves the engine into the parent class, which all our entities inherit from. This is how we were supposed to save our entity to the database in the ORM scenario (pseudo-code):

book.setTitle("Java in a Nutshell");
session.update(book);

And this is what we do with an ActiveRecord:

book.setTitle("Java in a Nutshell");
book.update();

The method update() is defined in the book's parent class and uses the book as a data container. When called, it fetches the data from the container (the book) and updates the database. How is it different from ORM? There is absolutely no difference. The book is still a container that knows nothing about SQL or any persistence mechanisms.

What's even worse in ActiveRecord, compared to ORM, is that it hides the fact that objects are data containers. A book, in the second snippet, pretends to be a proper object, while in reality it's just a dumb data bag.

I believe this is what misguided those who were saying that my SQL-speaking objects concept is exactly the same as the ActiveRecord design pattern (or Repository, which is almost exactly the same).

No, it's not.
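For contrast, here is a minimal sketch of what an SQL-speaking object could look like. Everything in it is hypothetical: FakeDb stands in for a real JDBC connection (it just records the SQL it receives, so the example is self-contained), and the string formatting skips escaping for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a database session; a real object would wrap a JDBC
// connection. This one just records the SQL it receives.
class FakeDb {
    final List<String> executed = new ArrayList<>();
    void execute(String sql) {
        executed.add(sql);
    }
}

// An SQL-speaking book: it encapsulates its identity and talks SQL
// itself, instead of being a passive data bag filled by a session.
class Book {
    private final FakeDb db;
    private final int id;
    Book(FakeDb db, int id) {
        this.db = db;
        this.id = id;
    }
    void rename(String title) {
        db.execute(String.format(
            "UPDATE book SET title = '%s' WHERE id = %d", title, id));
    }
}

public class Main {
    public static void main(String[] args) {
        FakeDb db = new FakeDb();
        new Book(db, 42).rename("Java in a Nutshell");
        System.out.println(db.executed.get(0));
    }
}
```

Notice the difference: nobody pulls the title out of the book in order to persist it somewhere; the book does its own persistence and exposes no data at all.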


Convince Me!


  • Palo Alto, CA

I've already explained how I understand the role and responsibilities of a software architect. But one question still remains unanswered, and it often turns into a problem in our projects: What does a software architect do when the project sponsor doesn't like his technical decisions? The architect implements something in a certain way, and the sponsor (or its representative) says that it's not exactly how things should work. What's next?

Beasts of No Nation (2015) by Cary Joji Fukunaga
Beasts of No Nation (2015) by Cary Joji Fukunaga

In our projects, a product owner (PO) is usually a representative of a project sponsor (the paying customer). Since all our projects are rather complex Java software packages, POs are very technical people. They are programmers or used to be programmers. They understand the code we write, and they want their opinion to be taken into account and respected.

And I'm not talking about stupid product owners---those guys are a separate story. I'm talking about a pretty reasonable PO with his own technical opinion that needs to be heard.

Here is a practical example. Last week, I was starting a project. I was an architect. It was a Java server-side module. I decided to use Maven as a build automation system.

I created some initial files, configured pom.xml, briefly explained the project structure in README.md, and submitted a pull request. Chris, the product owner, reviewed it and asked, "Why not Gradle?"

It was a reasonable question, right? Gradle is another popular build automation system that I could have used, but I didn't. The question is why. It was a pretty innocent question, and I explained the answer right there in my comment to the pull request. I said Maven was more suitable in this project because ... blah blah blah.

But Chris argued back. He was still thinking Gradle was the better choice. He had his reasons. Meanwhile, I tried to convince him of mine. I tried a few times and then realized I was doing something wrong. It shouldn't work like that.

A software architect should not convince a product owner, a customer, or anybody else. Instead, an architect must make his decisions and be responsible for the entire success or failure of the product, just like I explained before.

There is a simple reason for that. Any attempt to convince anyone creates the possibility of "responsibility leakage." What if I fail to convince? I will have to change my plan and use Gradle, right? What if the product has problems because of that decision? I will try to blame Chris for it, right? I can't be fully responsible for the product anymore, because I was "forced" to make at least one decision.

Don't get me wrong; a good architect must collect different opinions before making his own decision. But collecting Chris's opinion would look very different. I would ask him first what he thinks about Maven and Gradle. He would tell me that he doesn't like Maven because of this and that. And I would take that into account. Or maybe not. But my decision would still be mine, made by myself, under no compulsion by anybody. And Chris would still be able to blame me for any negative consequences of that decision.

But what should Chris do if he really doesn't like my decision? It's his money and his product, right? He does care. And he doesn't want to have Maven in his product. What does he do? How can he influence my decision-making process?

It's easy. There are two documents in each software project. The first one is requirements, and the second one is architecture. Chris should use them both to correct me and point me in the right direction. Here's how.

First, if he really doesn't want to have Maven, he should make changes to the requirements document. He should add something like "the build system must be Gradle, because ..." Or maybe even without the "because" part. It's up to him. In that case, I will have to take that into account, and I will. I know my design decisions are dictated by the requirements. And not because Chris convinced me or I failed to convince him, but because that's what the document says.

Second, if he is not entirely sure that Gradle is the right choice and just wants me to be more serious about my decisions, he should complain (by submitting a bug) about the quality of my architecture document. He should say the choice to go with Maven is not explained properly. I will then rethink my decision and will either change it or explain it better. But again, I will do it not to please Chris but to fix a reported bug.

To summarize, an architect must be an absolute technical dictator during the project and must not have to convince anyone. If that's not the case, the entire project faces big risks, simply because the responsibility will be "leaking."


The Law of Demeter Doesn't Mean One Dot


  • Palo Alto, CA

You've probably heard about that 30-year-old Law of Demeter (LoD). Someone asked me recently what I think about it. And not just what I think, but how it is possible to keep objects small and obey the LoD. According to the law, we're not allowed to do something like book.pages().last().text(). Instead, we're supposed to go with book.textOfLastPage(). It puzzled me, because I strongly disagree. I believe the first construct is perfectly valid in OOP. So I've done some research to find out whether this law is really a law. What I found out is that the law is perfect, but its common understanding in the OOP world is simply wrong (not surprisingly).

Spartacus (1960) by Stanley Kubrick

Object-Oriented Programming: An Objective Sense of Style, K. Lieberherr, I. Holland, and A. Riel, OOPSLA '88 Proceedings, 1988.

This is where it was introduced. Let's see what it literally says (look for Section 3 in that PDF document):

For all classes C, and for all methods M attached to C, all objects to which M sends a message must be instances of classes associated with the following classes: 1) the argument classes of M (including C), 2) the instance variable classes of C.

Say it's a Java class:

class C {
  private B b;
  void m(A a) {
    b.hello();                  // instance variable of C: allowed
    a.hello();                  // argument of m: allowed
    Singleton.INSTANCE.hello(); // globally accessible object: allowed
    new Z().hello();            // object created by m: allowed
  }
}

All four calls to four different hello() methods are legal, according to the LoD. So what would be illegal, I ask myself? No surprise; the answer is this: a.x.hello(). That would be illegal. Directly accessing the attribute from another object and then talking to it is not allowed by the law.

But we don't do that anyway. We're talking about book.pages().last().text(). In this chain of method calls, we're not accessing any attributes. We're asking our objects to build new objects for us. What does the law say about that? Let me read it and quote:

Objects created by M, or by functions or methods that M calls, are considered as arguments of M

In other words, the object Pages that method call book.pages() returns is a perfectly valid object that can be used. Then, we can call method last() on it and get an object Page, and then call method text(), etc. This is a perfectly valid scenario that doesn't violate the law at all, just as I expected.
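The book.pages().last().text() chain can be spelled out in a self-contained sketch (all class names here are hypothetical). Every object in the chain is created for its caller, so by the quote above each call is legal:

```java
import java.util.List;

class Page {
    private final String text;
    Page(String text) {
        this.text = text;
    }
    String text() {
        return this.text;
    }
}

class Pages {
    private final List<Page> all;
    Pages(List<Page> all) {
        this.all = all;
    }
    // Returns an object created for the caller: legal under the LoD.
    Page last() {
        return this.all.get(this.all.size() - 1);
    }
}

class Book {
    // Creates a new object for the caller: legal under the LoD.
    Pages pages() {
        return new Pages(List.of(new Page("first"), new Page("last")));
    }
}

public class Main {
    public static void main(String[] args) {
        // Three dots, and no attribute access anywhere in the chain.
        System.out.println(new Book().pages().last().text());
    }
}
```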

So where does this common understanding of the law come from? Why does Wikipedia call it a rule of "one dot" and say that "an object should avoid invoking methods of a member object returned by another method?" This is absolutely contrary to what the original paper says! What's going on?

The answer is simple: getters.

The majority of OOP developers think most object methods that return anything are getters. And getters, indeed, are no different than direct access to object attributes. That's why Wikipedia actually says "no direct access to attributes and, since most of your methods are getters, don't touch them either, silly."

That's just sad to see.

So the bottom line is that the Law of Demeter is not against method chaining at all. Of course, it's against getters and direct attribute access. But who isn't, right?


Who Is an Object?


  • Palo Alto, CA

There are thousands of books about object-oriented programming and hundreds of object-oriented languages, and I believe most (read "all") of them give us an incorrect definition of an "object." That's why the entire OOP world is so full of misconceptions and mistakes. Their definition of an object is limited by the hardware architecture they are working with, and that's why it is very primitive and mechanical. I'd like to introduce a better one.

Jackass: The Movie (2002) by Jeff Tremaine

What is an object? I've done a little research, and this is what I've found:

  • "Objects may contain data, in the form of fields, often known as attributes; and code, in the form of procedures, often known as methods"---Wikipedia at the time of writing.

  • "An object stores its state in fields and exposes its behavior through methods"---What Is an Object? by Oracle.

  • "Each object looks quite a bit like a little computer---it has a state, and it has operations that you can ask it to perform"---Thinking in Java, 4th Ed., Bruce Eckel, p. 16.

  • "A class is a collection of data fields that hold values and methods that operate on those values"---Java in a Nutshell, 6th Ed., Evans and Flanagan, p. 98.

  • "An object is some memory that holds a value of some type"---The C++ Programming Language, 4th Ed., Bjarne Stroustrup, p. 40.

  • "An object consists of some private memory and a set of operations"---Smalltalk-80, Goldberg and Robson, p. 6.

What is common throughout all these definitions is the word "contains" (or "holds," "consists," "has," etc.). They all think that an object is a box with data. And this perspective is exactly what I'm strongly against.

If we look at how C++ or Java are implemented, such a definition of an object will sound technically correct. Indeed, for each object, Java Virtual Machine allocates a few bytes in memory in order to store object attributes there. Thus, we can technically say, in that language, that an object is an in-memory box with data.

Right, but this is just a corner case!

Let's try to imagine another object-oriented language that doesn't store object attributes in memory. Confused? Bear with me for a minute. Let's say that in that language we define an object:

c {
  vin: v,
  engine: e
}

Here, vin and engine are attributes of object c (it's a car; let's forget about classes for now to focus strictly on objects). Thus, there is a simple object that has two attributes. The first one is car's VIN, and the second one is its engine. The VIN is an object v, while the engine is e. To make it easier to understand, this is how a similar object would look in Java:

char[] v = {'W','D','B','H',...'7','2','8','8'}; // 17 chars
Engine e = new Engine();
Car c = new Car(v, e);

I'm not entirely sure about the JVM, but in C++ such an object will take exactly 25 bytes in memory (assuming a 64-bit x86 architecture). The first 17 bytes will be taken by the array of chars and another 8 bytes by a pointer to the block in memory with object e. That's how the C++ compiler understands objects and translates them to the x86 architecture. In C++, objects are just data structures with a clearly defined allocation of data attributes.

In that example, attributes vin and engine are not equal: vin is "data," while engine is a "pointer" to another object. I intentionally made it this way in order to demonstrate that calling an object a box with data is possible only with vin. Only when the data are located right "inside" the object can we say that the object is actually a box for the data. With engine, it isn't really true because there is technically no data inside the object. Instead, there is a pointer to another object. If our object had only an engine attribute, it would take just 8 bytes in memory, with none of them actually occupied by "data."

Now, let's get back to our new pseudo language. Let's imagine it treats objects very differently than C++---it doesn't keep object attributes in memory at all. It doesn't have pointers, and it doesn't know anything about x86 architecture. It just knows somehow what attributes belong to an object.

Thus, in our language, objects are no longer boxes with data both technically and conceptually. They know where the data is, but they don't contain the data. They represent the data, as well as other objects and entities. Indeed, the object c in our imaginary language represents two other objects: a VIN and an engine.

To summarize, we have to understand that even though a mechanical definition of an object is correct in most programming languages on the market at the moment, it is very incorrect conceptually because it treats an object as a box with data that are too visible to the outside world. That visibility provokes us to think procedurally and try to access that data as much as possible.


If we thought of an object as a representative of data instead of a container of it, we would not want to get hold of the data as soon as possible. We would understand that the data are far away and we can't just easily touch them. We should communicate with an object---and how exactly it communicates with the data is not our concern.
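A minimal Java sketch of this "representative" idea; the map-based storage and the sample VIN are illustrative assumptions, standing in for a database, a file, or any other remote location:

```java
import java.util.HashMap;
import java.util.Map;

// The car does not contain its VIN; it only knows where to find it.
class Car {
    private final Map<Integer, String> storage; // where the data lives
    private final int id;                       // coordinates, not data
    Car(Map<Integer, String> storage, int id) {
        this.storage = storage;
        this.id = id;
    }
    String vin() {
        // Fetched on request; the object never holds the data itself.
        return this.storage.get(this.id);
    }
}

public class Main {
    public static void main(String[] args) {
        Map<Integer, String> vins = new HashMap<>();
        vins.put(1, "SAMPLE-VIN");
        System.out.println(new Car(vins, 1).vin());
    }
}
```

The Car instance itself occupies only a reference to the storage and an id; the VIN it represents could live anywhere.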

I hope that in the near future, the market will introduce new object-oriented languages that won't store objects as in-memory data structures, even technically.

By the way, here is the definition of an object from my favorite book, Object Thinking by David West, p. 66:

An object is the equivalent of the quanta from which the universe is constructed

What do you think? Is it close to the "representative" definition I just proposed?


Twelve Mistakes in Agile Manifesto


  • Palo Alto, CA

Nowadays, the Agile Manifesto is the Bible of numerous software teams. It contains 12 principles that show us how software development should be organized. These principles were written in 2001. Generally, I like and agree with all of them. However, in practice, most software teams misunderstand them. So here is a summary of what's going on and my interpretation of each principle.

Hail, Caesar! (2016) by the Coen Brothers

Principle #1: "Our highest priority is to satisfy the customer through early and continuous delivery of valuable software."

By focusing on "satisfy the customer," Agile adepts totally forget about the "through" part. They think that a happy customer is their true objective, while "continuous delivery" is something that obviously helps, though not crucially. However, it is quite the opposite---the customer will be satisfied if the software is perfectly created and delivered. If the customer is not satisfied, we find another customer---that's the true spirit a professional software team should adhere to. I believe that's what the Manifesto means. We make sure that our process is "early and continuous," which will result in customer satisfaction. We focus on improving our process, not on satisfying the customer. Satisfaction is the consequence, not the primary objective.

Principle #2: "Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage."

Most Agile teams understand the word "welcome" here as a permission to forget about any requirements management at all. What is the easiest way to welcome change? Obviously, just get rid of any requirement documents! In this case, any change will be welcome, since it won't affect anything. There simply won't be anything to affect. But this is not what the Manifesto means! This principle means that our requirements management process is so powerful that it can accept change at any moment. However, it's rather difficult to achieve, if requirements are actually documented.

Principle #3: "Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale."

This terrific rule is usually understood as an order for the entire team: the team has to deliver frequently, while individual programmers are free to deliver almost nothing, who knows when. I think the Manifesto here is emphasizing both individual and group responsibility for frequent delivery. I also think that this frequency should be way higher than just a "couple of weeks." Today, with modern technologies and instruments, we can deliver way faster---several times a day.

Principle #4: "Business people and developers must work together daily throughout the project."

Working together doesn't mean working without clearly defined rules and processes. However, most teams understand this principle as a legalization of chaos. They think that since we work together, we don't need to define roles anymore, we should not document requirements, and we shouldn't care about responsibilities. In the end, we know neither who is doing what nor what the team's structure is. That's not what the Manifesto is talking about! "Working together" means quicker turnarounds in communication and shorter response cycles. It definitely doesn't mean a lack of roles and responsibilities.

Principle #5: "Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done."

Trust is a great word and concept, but it doesn't replace another equally great word---control. Most Agile teams think that trust means exactly that---a complete lack of any validation, verification, responsibility, and control. "We trust our programmers to write perfect code"---I've heard that countless times, and it is simply wrong. This principle means something completely different. It means that when clearly defined tasks are assigned to their performers, we fully delegate responsibility to them. We motivate them to be fully responsible for the end result. However, we don't help them. Instead, we trust them as self-sufficient individuals, capable of completing assigned tasks on their own.

Principle #6: "The most efficient and effective method of conveying information to and within a development team is face-to-face conversation."

Face-to-face doesn't mean sitting in the same office. The Manifesto doesn't say anything about co-located or distributed teams. It's obvious that in modern software projects, virtual communications (over video calls) are way more effective than staying together in the same country, same city, same office, and same room. Nevertheless, most Agile adepts still promote the on-site development style, using the Agile Manifesto as proof. That's a mistake; face-to-face means something totally different from what it meant 15 years ago, when the Manifesto was written.

Principle #7: "Working software is the primary measure of progress."

This doesn't mean that we should not measure anything else. Of course, working software is the primary measure, but there are many other measures, which we can and must use. For example, the number of features documented, implemented, and delivered; the number of lines of code added to the project (don't smile, read); the number of bugs found; or the amount of dollars spent. There are many other metrics, and we can use many of them. However, a typical mistake many Agile teams make is simply ignoring them all. They say, "we measure only the end result." That's not what the Manifesto suggests, though.

Principle #8: "Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely."

This doesn't mean that we should indefinitely burn customers' money. Yes, we should be developing at some given speed, but we should always remember whose money we're spending---the customers'. The Manifesto doesn't say anything about the cost of development, and that's probably because it was written by those who make money (programmers), not those who spend it (customers). We must therefore remember that any project is, first of all, a money-burning machine. That's why the team must always measure its burn rate and make sure it's aligned with the amount of business value the team delivers. Just being a happy team is not what the Manifesto suggests, but that's exactly how many understand this principle.

Principle #9: "Continuous attention to technical excellence and good design enhances agility."

That's a perfect principle that says so much and nothing at the same time. What exactly is "attention"? I can explain. It means rules and policies. First of all, any policy means punishment for those who violate the rules. Thus, if an Agile team really means continuous attention to technical excellence, it must have a quality policy. That policy must clearly define which design is good and which is bad, which piece of Java code is excellent and which is ugly, etc. Additionally, the policy must say what happens to those who violate the principles of excellence. However, most Agile teams understand "quality" as a great flag to hang on the wall, but they get scared when I ask, "What happens if someone delivers low quality?"

Principle #10: "Simplicity---the art of maximizing the amount of work not done---is essential."

That's a great rule, which most Agile teams don't follow at all. This principle means that our tasks are small and simple enough to make sure they are either doable or cancellable. Huge tasks are the biggest threat to the manageability of any team, be it Agile or not. This principle encourages us to give programmers small tasks, which they can easily complete. However, most Agile adepts assume that simplicity is equal to stupidity. It is not. A simple task doesn't mean a stupid or unimportant task. A simple task is a clearly defined, small, and doable work order.

Principle #11: "The best architectures, requirements, and designs emerge from self-organizing teams."

Self-organized doesn't mean un-organized. This rule is often translated as a legalization of anarchy. We don't need any project managers, processes, discipline, rules, or policies---we've got holacracy instead! We also don't need a software architect---our programmers can make all technical decisions at regular meetings! Furthermore, we don't want our programmers to be individually responsible for anything---they are always together in all risks and issues. Stop that nonsense! This is not what the Manifesto means. A self-organizing team is a team that doesn't need any supervision from the outside; a team that has clearly defined roles from the inside; a team with a perfect inner discipline; a team with professional management. Not with the lack of all that.

Principle #12: "At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly."

That's a great principle, which is translated into so-called retrospective meetings. They work just fine as long as the decisions make the team better. Unfortunately, in most cases, programmers in Agile teams are trying to survive instead of making their teams more effective. Even though the principle says that the team has to become more effective, those retrospective meetings help programmers become more effective (read "more secure") in the team. That's only natural for people, but it leads to the overall degradation of the team. It's well known that the best team is the one that is capable of quickly and inevitably rejecting bad elements. Does your team do that effectively? Do retrospective meetings help with that? I doubt it. Therefore, I believe that what the Manifesto means here is not the meetings. It means that the team must have an effective mechanism of self-regulation and self-improvement. Retrospective meetings simply can't be that mechanism, because they prevent the team from making difficult disciplinary decisions.


Key Roles in a Software Project


  • Palo Alto, CA

I believe that several roles should be present in the majority of software projects. Our projects, managed by Zerocracy according to the principles of XDSD, have all of them. However, beware that in other management methodologies, these roles may have different meanings. This blog post is mostly for people who work with us, either as clients or freelancers.

12 Angry Men (1957) by Sidney Lumet

There are just a few roles:

  • Project Manager (PM) is responsible for keeping the project under control. The PM reports to the head of our PMO.

  • Product Owner (PO) is a representative of the sponsor. The PO provides product requirements. The PO submits bugs and expresses any concerns or questions relating to them. Usually, the PO is a very technical person who knows how the product works and is capable of understanding the source code.

  • Software Architect (ARC) is responsible for the entire technical solution. The ARC is blamed for all technical problems. The ARC approves all pull requests before we can merge them. The ARC is the main point of contact in the project for the PO. The ARC makes all technical decisions. The ARC reports to the PM.

  • Developer (DEV) is a programmer and is responsible for closing bugs. The DEV reports to the PM.

  • Requirements Analyst (REQ) is responsible for the validation of the product. The REQ solicits requirements from the PO. The REQ demonstrates the product to the PO. The REQ submits new bugs when validation fails and the product needs changes. The REQ reports to the PM.

  • Quality Assurance (QA) oversees the correctness of our process. The QA approves each closed task before it's officially closed by the PM. The QA ensures that our process complies with our policy. The QA reports to the PM.

  • Tester (TST) manually tests the product, finds bugs, and reports them.

Besides all that, any role is encouraged to submit bugs when they find them.


Data Transfer Object Is a Shame

  • Palo Alto, CA

DTO, as far as I understand it, is a cornerstone of the ORM design pattern, which I simply "adore." But let's skip to the point: DTO is just a shame, and the man who invented it is just wrong. There is no excuse for what he has done.

Before the Devil Knows You're Dead (2007) by Sidney Lumet

By the way, his name, to my knowledge, was Martin Fowler. Maybe he was not the sole inventor of DTO, but he made it legal and recommended its use. With all due respect, he was just wrong.

The key idea of object-oriented programming is to hide data behind objects. This idea has a name: encapsulation. In OOP, data must not be visible. Objects must only have access to the data they encapsulate and never to the data encapsulated by other objects. There can be no arguing about this principle---it is what OOP is all about.

However, DTO runs completely against that principle.

Let's see a practical example. Say that this is a service that fetches a JSON document from some RESTful API and returns a DTO, which we can then store in the database:

Book book = api.loadBookById(123);
database.saveNewBook(book);

I guess this is what will happen inside the loadBookById() method:

Book loadBookById(int id) {
  JsonObject json = /* Load it from RESTful API */
  Book book = new Book();
  book.setISBN(json.getString("isbn"));
  book.setTitle(json.getString("title"));
  book.setAuthor(json.getString("author"));
  return book;
}
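For completeness, the Book class implied here would be nothing but a bag of getters and setters; a minimal sketch, with the field names assumed from the snippet above:

```java
// A sketch of the anemic Book DTO implied above: mutable fields,
// getters, setters, and no behavior of its own.
class Book {
  private String isbn;
  private String title;
  private String author;
  public String getISBN() { return this.isbn; }
  public void setISBN(String isbn) { this.isbn = isbn; }
  public String getTitle() { return this.title; }
  public void setTitle(String title) { this.title = title; }
  public String getAuthor() { return this.author; }
  public void setAuthor(String author) { this.author = author; }
}
```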

Am I right? I bet I am. It already looks disgusting to me. Anyway, let's continue. This is what will most likely happen in the saveNewBook() method (I'm using pure JDBC):

void saveNewBook(Book book) {
  PreparedStatement stmt = connection.prepareStatement(
    "INSERT INTO book VALUES (?, ?, ?)"
  );
  stmt.setString(1, book.getISBN());
  stmt.setString(2, book.getTitle());
  stmt.setString(3, book.getAuthor());
  stmt.execute();
}

This Book is a classic example of a data transfer object design pattern. All it does is transfer data between two pieces of code, two procedures. The object book is pretty dumb. All it knows how to do is ... nothing. It doesn't do anything. It is actually not an object at all but rather a passive and anemic data structure.

What is the right design? There are a few. For example, this one looks good to me:

Book book = api.bookById(123);
book.save(database);

This is what happens in bookById():

Book bookById(int id) {
  return new JsonBook(
    /* RESTful API access point */
  );
}

This is what happens in Book.save():

void save(Database db) {
  JsonObject json = /* Load it from RESTful API */
  db.createBook(
    json.getString("isbn"),
    json.getString("title"),
    json.getString("author")
  );
}

What happens if there are many more parameters of the book in JSON that won't fit nicely as parameters into a single createBook() method? How about this:

void save(Database db) {
  JsonObject json = /* Load it from RESTful API */
  db.create()
    .withISBN(json.getString("isbn"))
    .withTitle(json.getString("title"))
    .withAuthor(json.getString("author"))
    .deploy();
}

There are many other options. But the main point is that the data never escapes the object book. Once the object is instantiated, the data is not visible or accessible by anyone else. We may only ask our object to save itself or to print itself to some media, but we will never get any data from it.
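To make this concrete, here is a minimal sketch of what JsonBook could look like. The Database interface and its createBook() method are assumptions taken from the example above, and a plain Map stands in for the JSON document so the sketch needs no JSON library:

```java
import java.util.Map;

// Assumed from the example above: the database knows how to create a book.
interface Database {
  void createBook(String isbn, String title, String author);
}

// A hypothetical JsonBook: the document stays encapsulated, and
// the only thing we can ask the book to do is save itself.
final class JsonBook {
  private final Map<String, String> json;
  JsonBook(Map<String, String> json) {
    this.json = json;
  }
  void save(Database db) {
    // The data goes straight to the database; it never escapes
    // through getters.
    db.createBook(
      this.json.get("isbn"),
      this.json.get("title"),
      this.json.get("author")
    );
  }
}
```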

The very idea of DTO is wrong because it turns object-oriented code into procedural code. We have procedures that manipulate data, and DTO is just a box for that data. Don't think that way, and don't do that.

PS. There are a few other names for DTOs: business objects, domain objects (not in DDD), entity objects, JavaBeans.


Singletons Must Die

  • Los Angeles, CA

I think it's too obvious to say that a singleton is an anti-pattern; there are tons of articles about that already. However, more often than not, the question is how to define global things without a singleton, and the answer to that is not obvious for many of us. There are several examples: a database connection pool, a repository, a configuration map, etc. They all naturally seem to be "global"; but what do we do with them?

Perdita Durango (1997) by Álex de la Iglesia

I assume you already know what a singleton is and why it's an anti-pattern. If not, I recommend you read this StackOverflow thread: What is so bad about singletons?

Now that we agree it's a bad deal, what do we do if we need to, let's say, have access to a database connection pool in many different places within the application? We simply need something like this:

class Database {
  public static final Database INSTANCE = new Database();
  private Database() {
    // create a connection pool
  }
  public java.sql.Connection connect() {
    // Get new connection from the pool
    // and return
  }
}

Later, say in a JAX-RS REST method, we need to retrieve something from the database:

@Path("/")
class Index {
  @GET
  public String text() {
    java.sql.Connection connection =
      Database.INSTANCE.connect();
    return new JdbcSession(connection)
      .sql("SELECT text FROM table")
      .fetch(new SingleOutcome<String>(String.class));
  }
}

In case you're not familiar with JAX-RS, it's a simple MVC architecture, and this text() method is a "controller." Additionally, I'm using JdbcSession, a simple JDBC wrapper from jcabi-jdbc.

We need that Database.INSTANCE to be a singleton, right? We need it to be globally available so that any MVC controller can have direct access to it. Since we all understand and agree that a singleton is an evil thing, what do we replace it with?

Dependency injection is the answer.

We need to make this database connection pool a dependency of the controller and ensure it's provided through a constructor. However, in this particular case, for JAX-RS, we can't do it through a constructor thanks to its ugly architecture. But we can create a ServletContextListener, instantiate a Database in its contextInitialized() method, and add that instance as an attribute of servletContext. Then, inside the controller, we retrieve the servlet context by adding the javax.ws.rs.core.Context annotation to a setter and using getAttribute() on it. This is absolutely terrible and procedural, but it's better than a singleton.

A proper object-oriented design would pass an instance of Database to all objects that may need it through their constructors.
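A minimal sketch of that constructor-based design; the Database and Index names come from the example above, while the url field is an assumption made just to keep the sketch self-contained:

```java
// The database is an ordinary object now, not a global INSTANCE.
final class Database {
  private final String url;
  Database(String url) {
    this.url = url;
  }
  String url() {
    return this.url;
  }
}

// The controller declares its dependency explicitly: whoever
// instantiates Index decides which Database it talks to.
final class Index {
  private final Database db;
  Index(Database db) {
    this.db = db;
  }
  String text() {
    return "fetched via " + this.db.url();
  }
}
```

In tests we can now pass a fake Database into the constructor; there is no global state to reset between test cases.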

Nonetheless, what do we do if there are many dependencies? Do we make a 10-argument constructor? No, we don't. If our objects really need 10 dependencies to do their work, we need to break them down into smaller ones.

That's it. Forget about singletons; never use them. Turn them into dependencies and pass them from object to object through the operator new.


How to Hire a Programmer

  • Los Angeles, CA

I get asked this question very often: Where and how do you find and hire a good programmer? Since I'm a programmer and I manage software projects, I'm supposed to know the tricks. I do, of course; there are many of them, but the list below succinctly summarizes the most important ones.

Don't Be a Menace to South Central While Drinking Your Juice in the Hood (1996) by Paris Barclay

I'll be referring to "him," but these recommendations apply equally to both female and male slaves, er, software developers.

Ask a Friend. The best way to find talent is through a reference. Who knows the software market better than your high school classmate who bought a WordPress website last year, right? He will definitely recommend a good programmer to you. Recommendations are the most effective way of finding contractors. First, you don't need to worry about screening and testing. Second, you'll have a very good explanation for why your project failed---because your friend let you down with a bad recommendation. Win-win.

Hire Only Locals. Don't even think about a remote programmer---remote projects always fail. Always. He will work in a different time zone, you will always have cultural clashes, and his Russian accent will be annoying. You simply won't be able to meet him every second day and whine about your project being too expensive, too slow, and too frustrating. Hire only locals---they are much easier to manage and punish.

Don't Offend With a Lack of Trust. A talented professional programmer will be offended if you ask him to pass a test or prove some of his skills. That will demonstrate that you don't trust him. And if you don't trust him, you simply should not work together. Trust is the most important thing in any project. Also, don't ask how certain things will be done. He is the professional you're hiring, and he knows what he is doing. That's enough.

Fall in Love. After trust, the most important thing is a personal connection between you two. I'm not saying you must fall in love with your programmer, but it won't hurt. You should feel an emotional touch with him. Otherwise Java code won't work like you need it to. If you can't fall in love, you should at least become good friends. Invite him to your home parties, go to movies together, and introduce him to your wife. All of this will seriously affect the quality of the product he is creating.

Don't Specify Too Much. That's what Agile recommends, and I second that---face-to-face communication is more valuable than documentation. Don't write any documentation, don't specify what exactly you need to develop, and don't think too much about your "user stories." It's all in the past. Modern software engineers figure everything out by themselves. Just let him be creative and communicative. If something isn't clear, just call him. Remember, a Skype call is always better than those boring documents that nobody knows how to write.

Motivate by Value. In order to create a great software product, he must be very excited about it. Make sure he is excited. If he is not excited, call him again. Motivate him. Talk to him. Explain to him your brilliant Google-killer business idea again and again. Until the moment he screams, er, says "I'm excited." He must know what a great value your product is producing for the entirety of civilization. And he must be excited. Do I have to say it one more time? Excited! Are you excited already? I'm excited.

Promise Job Security. Even if you just raised $2K for your startup from your wife's step-dad, promise your programmer a cloudless financial future. He must know that you've got enough to pay him until he retires. A good programmer must want to work with you forever. That's the type of programmer you need. You don't want one of those greedy freelancers who always jumps from project to project. You need a long-term commitment. That's why you have to pretend you're rich enough.

Delay Money Talks. Don't mention money for as long as you can. Ideally, ask him to create a prototype first and "then we'll discuss your salary." A good programmer doesn't work for money. He works for satisfaction. That's who you need. You should discuss value, excitement, features, market disruption, and anything else that's important, but not money. Programmers in general are not really good at financial negotiations. Use that to exploit him for as much as you can.

Don't Negotiate. Eventually you will have to discuss money. Make sure there will be no negotiation involved. It'll be an offensive process, and most programmers are very sensitive. Just tell him how much you will pay, and if he doesn't feel that is enough, get back to the value/excitement/market conversation. Do it again and again until he agrees.

Require Full Commitment. Make sure he will be fully committed to the project. Ideally, he must not have any other projects or even any personal life at all. He must promise to be 100 percent with you and your idea. If he is planning on doing something else, demonstrate that it will offend you. Act jealous, like a loving wife. You don't need a programmer who is interested in something else.

Make Him a Partner. First of all, making him a partner will save you a lot of money. Ideally, you should convince him to work for free. Good programmers are good entrepreneurs and like to take risks. A good programmer knows that in order to become the next Mark Zuckerberg, he must start at a job with no salary. Give him some equity and keep those motivational speeches coming. It's a perfect money-saving technique.

Be Positive. Simply don't tell him about your risks and concerns. The future of your project is bright, and he doesn't need to know more. You're going to be his leader, and a good leader is never too honest with subordinates. Always be positive about the plan---he needs to get that from you.


That should be enough to find and hire a good programmer. Interestingly enough, I just re-read this text one last time and it looks to me like a "how to find a wife" tutorial. Don't you think? Anyway, did I forget anything? Don't hesitate to post some extra wisdom below in the comments section.


Don't Use Java Assertions

  • Los Angeles, CA

There are basically two ways to validate a situation in Java and complain when something unexpected happens. It's either an exception or an assertion. Technically, they are almost the same, but there are some small differences. I believe that exceptions are the right way to go in all situations and assertions should never be used. Here's why.

Natural Born Killers (1994) by Oliver Stone

Let's see what happens when an assertion is triggered. Say that this is our code:

public class Main {
  public static void main(String... args) {
    assert true == false : "There is a problem";
    System.out.println("Hello, world!");
  }
}

Save this code to Main.java and compile:

$ javac Main.java

Then run it:

$ java Main
Hello, world!

The assertion wasn't triggered. It was ignored. Now run it with the -enableassertions flag:

$ java -enableassertions Main
Exception in thread "main" java.lang.AssertionError: There is a problem
  at Main.main(Main.java:3)

This is the first difference between exceptions and assertions. Exceptions will always be thrown, while assertions are not enabled by default. They are supposed to be turned on during testing and turned off in production. The assertion caused an AssertionError. But hold on; it's not a RuntimeException. It extends the Error class, which extends Throwable. This is the second difference. I don't know of any other differences.

I would recommend not to use assertions ... ever. Simply because I strongly believe in the Fail Fast approach. I think bugs must be visible not only during testing but also in production. Moreover, I believe making bugs visible in production is very important if you want to achieve a high-quality product.
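Following that logic, the same check can be written as a plain exception; a sketch of the Fail Fast alternative (the greeting() method is invented here just to keep the example self-contained):

```java
class Main {
  // The same validation as the assert above, but thrown as an
  // exception: it fires in production too, not only when the JVM
  // runs with -enableassertions.
  static String greeting(boolean valid) {
    if (!valid) {
      throw new IllegalStateException("There is a problem");
    }
    return "Hello, world!";
  }
  public static void main(String... args) {
    System.out.println(greeting(true));
  }
}
```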

Thus, no assertions. They are simply a flawed and outdated feature in Java (and some other languages).


11 Mistakes Conferences Keep Making

  • Milan, Italy

I was talking yesterday with a few friends who were software conference organizers. They were asking about my opinion of the conferences I've recently attended. Basically, they were interested to know what I would suggest for improvement. So, I decided to summarize it in a list of the most typical mistakes all conferences keep making and give them some ideas. Remember, I'm judging from a speaker's position. The most serious mistakes and pieces of advice are at the bottom.

Mi gran noche (2015) by Álex de la Iglesia

Too formal. Very often all I'm getting from event organizers is a formal email that my presentation was accepted and then some travel details: "your hotel is here, don't miss this speaker's dinner." That's it. I don't know who is behind the conference, what this event is for, etc. At that speaker's dinner (if I don't miss it) some details are cleared up. At some conferences, they even communicate with me via some online form and I have to answer there through login/logout. This is rather annoying and turns me off. It would be much better to feel that there is a team with real people, who want to see me as a speaker. I would be way more dedicated and motivated.

Short breaks. Breaks between presentations that are too short give listeners an impression that each particular presentation is not as important as the event as a whole. This really de-motivates me as a speaker. I want to feel valuable. I came to the conference to be heard. I think my speech is the most important and I don't want people to jump quickly out of the room, just to have enough time to drink coffee and run to the next presentation. 30 minutes between talks is a comfortable time---enough to forget the previous talk and focus on the next one. Also, four talks per day is an absolute maximum for a serious listener.

No moderator. There must be someone on the stage who introduces the speaker and asks questions if the room is quiet. All academic conferences have that. I don't understand why most industry events don't do the same. A speaker must feel comfortable on the stage. Being there alone doesn't help at all. Also, a moderator will push me to engage better with the audience.

Boring website. The higher the level of the conference, the more modern and beautiful its website has to be. It's not only your face but also mine. I'm not only attending your event but also associating my name with it. I will share your website on my Twitter, my Facebook, my blog, etc. That's why I really want it to look cool. Some conferences don't care about that, and this attitude seriously demotivates me as a speaker.

Cheap venue. This means literally what it says---the place is too cheap and, because of that, bad. The impression of such a place ruins the entire conference. Don't fool yourself, if it's cheap---it's bad. Into this category I would also place venues that are not designed to be conference hosts, like cinemas or offices. They will also be cheaper than proper venues, but you will get what you pay for.

No introduction. I believe it's a job of conference organizers to promote me as a speaker in front of the audience. Not only on the stage, but also online, via Twitter, Facebook, etc. Most conferences don't do that at all. I come to the stage completely unannounced with almost zero interest from the room. They simply don't know who I am and why I'm here. The conference must promote us, the speakers, very pro-actively.

Slow WiFi. It's just very annoying.

No networking. Conference organizers know all the speakers who are attending and most of the listeners. They are at the center of this networking event, but they almost never use that position effectively. They should help me network while I'm there by introducing me to those I may be interested in. This won't be hard at all for them. Just tell me, "hey, let me introduce you to this guy who is working on and speaking about something similar to you." Just make 5-10 such short intros, and I will be busy talking to these guys for the whole day. Instead, at most conferences, speakers arrive, speak, drink their coffee, and leave. That's sad.

No video. Most conferences don't record videos, and it's a terrible mistake. They probably think that their events are more important than us, their speakers. It's not true. I'm speaking in front of 50 people, recording it on my iPhone, publishing it on my YouTube channel, and getting 500 views in the first week. So, who is more important: your event, which was able to attract just 50 people, or my YouTube channel, which attracted 500? Thus, I would strongly recommend paying a lot of attention to video recording and its quality. Preferably, you should record from two cameras (the speaker and the room) and do video editing afterwards. It's expensive, but very important.

Bad Equipment. I mean sound, light, projectors, screens, microphones, etc. It's very annoying to see speakers and the audience suffer from all those technical problems.

Too small. It's very difficult to present when there are fewer than 100 people in the room. In my experience, only one out of 10-15 people is actually listening and understanding. That's an average number. Of course, it depends on the subject, but not too much. This means that if there are 30 people in the room, only a few of them are my active listeners. It's very difficult to present to just a few people. You simply can't afford to lose their attention, even for a second. With a hundred people, the situation becomes more manageable. There are 8-12 people who are actively listening. Even if I lose a few of them, it's not a big deal. The best size for the room is 250 people. That's the ideal audience for a technical talk.

Too big. A very big audience (over 500 people in the room) causes another problem---I completely lose any contact with my listeners. I don't see their eyes any more. I can't see their reaction, I don't understand whether they follow me or not. I'm not a technical speaker any more, but a rock singer. All I can do is turn my presentation into a show and play with their emotions, not their brains. This is actually what most keynote speakers are usually doing.

Too many tracks. Honestly, I think that the very idea of having multiple tracks is very bad. I want the entire conference to attend my talk. I don't want to lose anyone. Moreover, I don't want to compete with some clown (no offense to clowns) just because he is speaking at the same time. It's a very demotivating competition, since I can't do anything to win. I want to feel that I'm speaking here because there is some value in my information. I want to know that I'm the chosen one. And I want to feel that conference attendees feel the same. A perfect conference must have one track, a room for 250-300 listeners, and a very well selected list of speakers. I believe that most conferences are shooting themselves in the foot trying to please the audience by inviting too many speakers. Instead, do your homework right---select the best speakers and don't make listeners walk from track to track cursing you for the mediocre speeches. Choosing speakers is your job! Don't delegate it to your audience.


Did I forget anything? Please, post below in the comments!


Who Is a Project Manager?

  • Milan, Italy

A project manager is very often confused with a leader. However, they are two very different things. A project manager is the one who predicts the future, while a leader is the one who builds it. And, in my opinion, a perfect project manager is much more valuable for a project than a leader. If a leader is valuable at all...

Schindler's List (1993) by Steven Spielberg

There are three things I want to define: project, project management, and project manager. Once they are clear, my previous statement will become obvious.

A project is a vector from W1 to W2, where Wt is a set of all resources and risks in the world at some defined point in time t. Wt is the world, at the moment t. A project transforms the world, moving it from one state to another. PMBOK defines projects as "temporary endeavors undertaken to create unique products, services, or results," which is just a specific case of my definition. Mine is more abstract, I believe.

Consider this example. You woke up in the morning and made yourself a cup of coffee. That was a project. When you woke up, the world was in state W1. There were some coffee beans in the bag, some water in the tap, and some electricity in the power station. And there you were, standing in front of the coffee machine. These were the resources (including yourself). There were also risks. A blackout could have happened, right? The machine could have broken, right? In theory, there was an unlimited number of risks, including a zombie riot. However, the majority of them had very low probabilities; that's why you managed to make that cup of coffee.

When the coffee was ready, the world was in state W2. There were no coffee beans in the bag anymore, the water was used, and so was the electricity. However, a cup of coffee was created. We may call that project a success, but that's not really important, and it's not even correct. What's important is that it's finished. We successfully transformed the world from state W1 to state W2. You may be surprised to hear that the project was not a success. Indeed, it was not. It was a success only for you, one of its stakeholders. How about your roommate, the owner of that bag of coffee beans, who asked you yesterday not to use them because he was waiting for a date tonight? How much of a success was your project to him?

So, a project is never a "success" or a "failure." A project is either dead or alive; that's it. Success is a subjective category and can only be measured per stakeholder. And even a small project has many stakeholders. Think about the electricity company that sold you a few kWh and made some profit out of it. The project was definitely a success for them. What about Mother Nature? Your project was definitely a failure for her, since you produced a few kilograms of CO2 while making that damn coffee. As you see, success is very subjective.

And we're in line with the PMBOK definition. Our coffee making project was indeed a temporary endeavor undertaken to create a unique product, which was a cup of coffee.

Did we have a project manager? No. Were we doing any project management? No. Well, not explicitly. Obviously, you were the project manager, but you didn't realize that.

Project management is a set of tools to predict the outcome of a project. Planning is one of those tools. Guessing is another one. Expert judgment is yet another, which you were using while making that coffee. You were an expert and knew how to use the machine, the electricity, and the water tap. You didn't need any other tools except your expert judgment. And it worked. In bigger projects, we would need more powerful instruments and methods. For example, we could use some scheduling software to plan when to put the beans into the machine, when to put the cup under the dripping point, and when to press the button. You might also need budgeting software to calculate how much money you will owe the roommate. You might use a few risk identification and planning algorithms, etc.

Most of these tools are mentioned and explained in the PMBOK. They are even grouped there into so-called "knowledge areas": for predicting time, money, risks, people, etc. It's not important how exactly you predict the future, how many tools you're using, or what knowledge areas you break them into. What's important is that you must try to do it with as much accuracy and precision as possible. Here comes the definition of the main guy.

A project manager (PM) is the one who predicts the future. The PM knows in what state W2 the world will be when the project is finished. If the PM doesn't know or is in doubt---it's a bad PM. If the PM knows and is certain about it---it's a good PM. That's it.

And I have to say, in that coffee-making project you were a lousy PM. Did you know the probability of the project finishing without a cup of coffee being made? A good PM would say, "after an analysis of 230 risks, I predict the probability of that coffee being tasty as 87.4%." Obviously, you didn't have that information. Next, did you know what the total monetary value of the project would be after its completion? Did you calculate all incurred costs, including the price of the environmental damage your coffee machine did? A good PM would say, "the total cost of the project is expected to be $1.09." Were you able to predict the duration of the project precisely? Well, maybe that one you were rather good at.

There is only one reason why we want to put a project manager on top of the project. I'm sure you will be surprised to hear it: the only purpose of a PM in a project is to help its key stakeholders (also known as sponsors) to make a decision: to kill the project right now or to let it stay alive for a bit more. That's it.

You didn't need a PM in your coffee-making project because you, as its key stakeholder, were fully committed to finishing it only when the coffee was ready. But imagine another situation. The coffee machine suddenly breaks, the water stops, the electricity blacks out, and some zombies are knocking at your door. And you still want that cup of coffee. Well, you're not entirely sure what's more important now: the coffee or simply finding a way to survive. You will need a more or less accurate prediction of how much that coffee will cost you and when it will be ready. If it's just a few minutes and everything will be fine again, you will keep waiting for it. However, if the prediction is five hours and the risk of failure is 93%, you had better terminate this project and do something else.

That's exactly what happens in software development projects and all other projects. Project sponsors need to know whether the project is worth moving forward with or whether it's time to stop it and do something else. That's what they hire project managers for. This is the only reason millions of PMs exist---to predict the future, so that we can kill our projects before they kill us (read "eat all our resources").

You may ask---what about the coordinating part? What about morning stand-ups? What about walking around the office and motivating all the office slaves so that they don't get lazy? Isn't it the primary responsibility of a PM?

Not really. This is what a PM does in order to better understand the situation and predict the future. But it's not what a PM is paid for. Indeed, a bad PM goes around the office and calls multiple meetings a day. This is also known as "staying on top of things"---a perfect term to define an amateur PM. A bad PM becomes the future, instead of predicting it. He micromanages the team by telling everybody what to do, since this is the easiest way to know what will happen and when, in the short-term. But the long-term future stays absolutely unclear. A bad PM mostly relies on expert judgment, just like you did while making that coffee.

A good project manager is a completely different creature. A good PM finds a way to organize resources in such a way that their future becomes predictable. The key word here is "organize." A good PM organizes people, money, time, risks, stakeholders, and many other things. He uses planning and budgeting software in order to better see the future. But he doesn't become the future and he doesn't build the future. His people do that, he just observes. He only collects information from many possible sources and estimates what will happen, how much it will cost and who will suffer most and least. At any moment in time, he knows exactly when the project will be finished, how much it will spend, how many results it will produce, what the quality will be, and what the accuracy of that prediction is.

A good PM doesn't personally give orders to the team and doesn't meet people to tell them what to do. Instead, he makes sure that all communication is happening through a project management information system (PMIS). Moreover, in a perfectly organized project, a PM won't even need to give any orders to the team. Work orders will be created, approved, assigned and verified by the team itself. The PM will make sure that the workflow is seamless and disciplined. But he won't be personally responsible for telling people what to do.

A perfect PM won't even be visible to the team. Everything will be obvious and clear: plans will be available, work orders explicitly defined, risks identified and documented, concerns properly reported, stakeholders informed in time, etc. This may sound like utopia, but that's the true meaning of a "project manager" role.

I believe it's already obvious that project management has very little to do with leadership. They are just two orthogonal skill sets. I would say that a perfect PM won't even need any leadership skills, while a lousy PM will need a lot of them. As far as I understand, being a leader means having enough inner power (also known as "charisma") to make people do what you need. But that's totally against what we just discussed. A project manager doesn't want people to do what's needed because of his charisma. Instead, he wants people to be leaders of their own tasks. They have to move forward driven by their own motivation and selfish interests, according to the plans and rules defined by the PM. A charismatic project manager will inevitably replace the rules with his or her own personality and the entire idea of project management will be ruined.

This is my understanding of project management.

© Yegor Bugayenko 2014–2018

What to Worry About in Convertible Notes


  • Palo Alto, CA

"Convertible Notes" is what you will most likely hear the first time you get money for your first startup. Investors will give you cash and ask for convertible notes in return (or a SAFE, which is very similar). Convertible notes are just a few pages of paper with two signatures at the bottom. Not too much to worry about. It's basically a contract between your startup and an investor. Let's see what exactly it says and what you, as a founder, should pay attention to.

The Godfather: Part III (1990) by Francis Ford Coppola

Why Not Equity?

The first question is: why convertible notes? Why not just shares of stock? And what the hell are "shares of stock" in the first place, right? Basically, there are two questions in each business, or in any group activity, be it a new mobile app, a multi-national corporation or a bank robbery: 1) who is the boss, and 2) who gets the profit. "Shares of stock" were invented to regulate exactly that (if you know who invented them and when, let me know).

Say we're planning to rob a bank... I mean, create a Facebook killer. There are three of us. We print three papers, each of which says: "whoever holds this paper has one vote and will get an equal part of the profit." How does that sound? Each of us has the same paper. When it's time to decide whether we use Java or PHP, we sit together, show our papers and vote. One vote for Java, two for PHP---the decision is made, we will use PHP. When our startup finally dies and it's time to decide what to do with the domain name, we sell it for $300 and give $100 to each holder of that paper, since there are just three papers and they have equal rights.

Thus, basically, each share of stock (this is an official name of that piece of paper) is a promise. A promise of some rights to vote and to make profit. The company (our startup) is making us a promise.

By the way, I can sell my share of stock to my friend. When it's time to decide whether it's Java or PHP, he will show up and vote. You may not like that, since you are seeing this dude for the first time, but you will have to obey---he's got that paper in his hands. That's why shares of stock are also called equity. I can sell them just like I can sell my car. No matter who owns them, he or she has exactly the same rights as the original or the previous owner. They are assets.

Usually, there are millions or billions of shares of stock. When a company starts, it prints, say, a million of them, giving 200,000 to each co-founder and leaving 400,000 in the so-called "pool." Later, an investor shows up and says: "I will put $500,000 into the bank account of the company and the company will print 300,000 more shares of stock for me." The number of shares "issued" keeps growing. For example, at the time of writing there are 7.91 billion shares of Microsoft stock. Microsoft Corporation has printed extra shares nine times since its IPO in 1986. When Bill Gates founded the company in April 1975, he had 500K shares, which were equal to 50% (I'm guessing; do you know the exact numbers?). Now he holds nearly 223M, which is just 2.8% of the total.

Now, the most annoying part. In reality, shares are not just pieces of paper with a few sentences on them, like in our example above. They are big legal documents that explain exactly how their holder can vote and exactly when and how he or she will get the profit. There are tons of legal clauses, which usually take weeks or months of discussions between the company and investors. In reality, an investor says: "I will put $500,000 into the bank account of the company and the company will print 300,000 more shares of stock for me, the terms and conditions of which my lawyer will discuss with you."

If we're talking about $500K, you will have no problem meeting those lawyers. However, if it's just $25K... To make life easier for smaller investments, convertible notes were invented (well, there were a few other reasons). They are not equity. Investors that hold convertible notes can't vote. They can't sell convertible notes and they can't get any profit from the company. So, what are they for, then? I'll explain in a minute. My goal so far was to show why young companies don't want to deal with shares of stock---because of greedy lawyers and, of course, the complexity of terms and conditions.

What are Convertible Notes?

They are just debts. They are not real investments. The company simply borrows money from an investor, promising to pay it back. Why not just call them "money borrowing notes?" Because investors don't want their money back. They want equity.

So here is how it works. Say I'm an investor, giving you $25K. You give me convertible notes. Then we wait. We wait until a more serious investor shows up and gives you a bigger sum of money. And it's not just a matter of amount. What's important is that this investor must get shares of stock from you. This will be called "equity financing." You get financing and give away equity. When this happens, I show up, give you the convertible notes and you give me equity. On the same terms as you gave to that investor. I won't send you my lawyers, you won't discuss terms and conditions. You will just convert my notes to equity, on the same terms as agreed with that investor. Plain and simple.

A practical example. There is you and your co-founder. You guys have 1,000,000 shares of stock, 500K issued to each of you. I give you $25K, you give me convertible notes. In a few months, an investor comes in and your company issues 100,000 shares and sells them for $400,000 (they deposit a $400K check into the bank account of your company). This means that now there are 1,100,000 shares in total. You just sold 100K of them at a price of $4 per share. Now it's time to convert my convertible notes. You will have to give me 6,250 shares and I'll return the notes to you. Thus, in the end, there will be 1,106,250 shares total and your company's post-money valuation will be $4,425,000. Got the math?

My shares will have exactly the same "rights, privileges, preferences and restrictions" as the shares you gave to the investor. And I won't have an option to negotiate. I will just receive them and accept.
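
Since the examples in this book are in Java, here is a tiny sketch of the conversion arithmetic above (the class and method names are mine, for illustration only; they are not part of any real contract or library):

```java
// A sketch of the uncapped, undiscounted conversion math
// from the practical example above (hypothetical names).
public class NoteConversion {

  // Price per share paid by the equity investor.
  static double pricePerShare(final double invested, final long shares) {
    return invested / shares;
  }

  // Shares the note holder receives at that same price.
  static long convertedShares(final double notes, final double price) {
    return Math.round(notes / price);
  }

  public static void main(final String[] args) {
    final double price = pricePerShare(400_000, 100_000); // $4.00
    final long mine = convertedShares(25_000, price);     // 6,250 shares
    final long total = 1_000_000 + 100_000 + mine;        // 1,106,250
    final double postMoney = total * price;               // $4,425,000
    System.out.printf("%d shares, post-money $%.0f%n", mine, postMoney);
  }
}
```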

One more thing. If that investor never shows up, you still owe me $25K. A debt is a debt.

Now, since we know what convertible notes are for and how they work, let's see what is important to pay attention to. There are just a few things, but they are really important.

The Valuation Cap

Let's take a look again at the example above. You are selling 100,000 shares for $4. This technically means that the shares the two of you had, before the investor showed up, suddenly got some value, right? They were just papers, but now someone is ready to pay $4 for each of them.

This means that each of you, being a holder of 500K shares, owns equity for $2,000,000 (I'm just multiplying 500K by $4). Also, this means that the valuation of the company is $4M. I'm just multiplying the total amount of shares, which is a million, by the price of each share. This valuation is also called pre-money valuation (the valuation before that $400K landed at your bank account).

There is also a post-money valuation, which, as you can imagine, is calculated by multiplying the total number of shares after the investment by their price. In this case, it's $4.4M (1,100,000 by $4).

Let's see what happened in our example with my $25K. I gave it to you when your company was very young. Your valuation was rather low, because you barely had any results. You needed a small amount of cash to pay your bills and fill your car with gas. The valuation was definitely lower than $4M. So why are you converting my notes as if, at the time of my investment, the valuation was already that high? It's not fair. I want to get more than 6,250 shares. I want my part to be calculated as if your valuation was, say, $500K. At $500K for 1,000,000 existing shares, the price is $0.50 per share, and my $25K buys 50,000 shares. That's fair. The investor will pay $400K to get 100K shares, but I paid just $25K to get 50K of them. I earned more equity, because my risk was way higher.

To make that math happen, we put a "valuation cap" into the convertible notes. There will be a clause that guarantees that no matter what the pre-money valuation is at the moment of "equity financing," in my formula it will stay $500K.

Obviously, for you as a founder, an ideal situation would be to have "no cap" convertible notes. That's the first thing you should try to insist on: no cap! Most investors will smile back and disagree. It's only logical. Then, try to negotiate the value of the cap. Try to make it as big as possible.

But remember, it's better to have money and a small cap than a big cap and no money. Does it sound too obvious?

The Discount

Here is the same problem, but a different instrument. Again, as an investor, I don't like that you're selling me shares for $4. This is the price you are giving to the investor who came way later than myself. Their risks are way lower. I want a discount!

We can put a clause into the convertible notes which will say that the price for me will be the same as for the investor at the moment of "equity financing," minus, say, a 50% discount.
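
The discount arithmetic can be sketched the same way, reusing the numbers from the earlier example (names are mine, purely for illustration):

```java
// Hypothetical sketch of a discount clause: the note holder
// buys at the investor's price minus the agreed discount.
public class DiscountedConversion {

  // Effective price per share for the note holder.
  static double discountedPrice(final double price, final double discount) {
    return price * (1.0 - discount);
  }

  public static void main(final String[] args) {
    final double price = discountedPrice(4.0, 0.5); // $2.00 per share
    final long shares = Math.round(25_000 / price); // 12,500 shares for $25K
    System.out.println(shares + " shares at $" + price);
  }
}
```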

Again, as a founder, you should insist on "no discount" convertible notes. Will I agree? Probably not. Especially if there is no cap. Try to negotiate a smaller discount. Maybe 10%, just to give me a feeling of appreciation.

The Interest

Remember that by signing convertible notes and sending you cash, investors are basically lending you money. You owe that $25K to them. And some of them will ask for interest. And the interest may be payable annually. Say, 5% per year. That means you will have to send them a check for $1,250 every year, no matter how your startup is doing.

It's only logical for them, but it's totally against you. Do not agree to pay any interest.

The Maturity Date

Some investors are ready to wait for that "equity financing" moment for as long as necessary. Others may demand that you pay them back on a so-called "maturity date." Pay cash, with interest. This date will usually be somewhere far ahead, like "three years from now." But don't feel too relaxed; this day will come faster than you expect.

Try not to give convertible notes with a maturity date to anyone.

SAFE is a form of convertible notes, introduced by YC, which doesn't have a maturity date at all. This technically means that you don't have to pay them anything back. Well, there is only one situation when you have to pay---in case your startup dies. In that case, you will have to pay investors as much as you can, using the cash you still have in your bank account. Most likely there will be nothing, so don't worry.


There are other, less common or less important, elements of convertible notes, which you most likely won't ever see or should not worry about, like pro rata rights, for example. Just focus on the things listed above and you will be good.


Keynote Clowns


  • Kiev, Ukraine

Over the last six months, I've attended 18 conferences and heard over 30 keynote sessions, mostly about software development and management. I think I now know all the secrets of a successful keynote speaker. It doesn't look so difficult to become one. Here are my thoughts.

Bean (1997) by Mel Smith

Be obvious! Don't take a chance by suggesting something new. It's risky and some people may disagree with you. That's not good. The goal is to have everybody in the room completely agree with what you're saying. That's how you make a good speech. The audience will be comfortable and relaxed, and you will have no risk of being questioned afterwards. A few safe headline examples: "trust is very important" or "software must be stable." Everybody will be nodding their heads---that's all you need.

Joke! You must make them laugh. You must open with a joke and continue with many of them. Prepare them carefully. Just Google "good keynote jokes" and use what smart people recommend. A well-prepared collection of jokes is much more valuable than the content you will be talking about. Nobody will remember the content, but the jokes will definitely be re-tweeted. When a good speaker is talking, the room is laughing every 60 seconds.

Swear! Don't be too formal and boring, show a slide with a picture of a naked butt every once in a while. Everybody will understand that you're not only a speaker but also a good friend. Also, your language should be rather loose. Pretend you're talking to a friend over a pint of beer. Remember, the goal is to be funny.

Repeat! Always bring the same content with you, to all conferences. It's easier for everybody. First, conference organizers will know for sure what you will be talking about. They can even watch your 4-year-old video-recorded presentation and see exactly what words and slides you're planning to use. Second, you won't be nervous, since you'll be telling the same jokes over and over again. Everybody wins.

Kitties! Cute kitties. We all love them! Attach them somehow to your content. It is not really important whether they are related or not. You must show love. Instead of cats, you can use a picture of your 2-year-old daughter or of yourself in primary school. It has to be something sweet and adorable.

Keep talking! A good keynote speech fills the entire 60 minutes, leaving absolutely no time for questions. Actually, a perfect speaker will be interrupted after the 145th slide and will say that if anyone wants to know more, there is always a place near the restroom; let's go there and continue. Thus, stay focused on your slides and try to avoid questions at the end---you may create a negative impression if you mess up the answers. They came to listen to you, not to ask questions---keep talking.


On a more serious note, I'm very disappointed by what I've seen in almost all conferences so far. These keynote speakers are in most cases just making money, delivering the same "fun" again and again. They make $2-3K a speech and we, the listeners, get absolutely nothing new out of them.

Conference organizers keep inviting them, just because of the names. And we keep attending those conferences, also just because of the names. But do these names really mean anything? I don't think so. These guys are, in most cases, just retired losers with good presentation skills.

It would be much better to spend the money conferences waste on these big names on training practical speakers from the trenches, with really fresh and interesting content. As far as I understand, conference organizers are just too lazy to do that. It's just easier to buy a "proven" clown.

It's sad.


Test Methods Must Share Nothing


  • Palo Alto, CA

Constants... I wrote about them some time ago, mostly saying that they are a bad thing when they are public. They reduce duplication, but introduce coupling. A much better way to get rid of duplication is to create new classes or methods---the traditional OOP way. This seems to make sense, and in our projects I see fewer and fewer public constants. In some projects we don't have them at all. But one thing still bothers me: unit tests. Most programmers seem to think that when static analysis says there are too many similar literals in the same file, the best way to get rid of them is via a private static literal. This is just wrong.

Oldeuboi (2003) by Chan-wook Park

Unit tests, naturally, duplicate a lot of code. Test methods contain similar or almost identical functionality, and this is almost inevitable. Well, we can use the @Before and @BeforeClass features more, but sometimes it's just not possible. We may have, say, 20 test methods in one FooTest.java file. Preparing all the objects in one "before" method is not possible. So we have to do certain things again and again in our test methods.

Let's take a look at one of the classes in our Takes Framework: VerboseListTest. It's a unit test and it has a problem, which I'm trying to tell you about. Look at that MSG private literal. It is used for the first time in setUp() method as an argument of an object constructor and then in a few test methods to check how that object behaves. Let me simplify that code:

class FooTest {
  private static final String MSG = "something";
  private Foo foo;
  @Before
  public final void setUp() throws Exception {
    this.foo = new Foo(FooTest.MSG);
  }
  @Test
  public void simplyWorks() throws IOException {
    assertThat(
      foo.doSomething(),
      containsString(FooTest.MSG)
    );
  }
  @Test
  public void simplyWorksAgain() throws IOException {
    assertThat(
      foo.doSomethingElse(),
      containsString(FooTest.MSG)
    );
  }
}

This is basically what is happening in VerboseListTest and it's very wrong. Why? Because this shared literal MSG introduced an unnatural coupling between these two test methods. They have nothing in common, because they test different behaviors of class Foo. But this private constant ties them together. Now they are somehow related.

If and when I want to modify one of the test methods, I may need to modify the other one too. Say I want to see how doSomethingElse() behaves if the encapsulated message is an empty string. What do I do? I change the value of the constant FooTest.MSG, which is used by another test method. This is called coupling. And it's a bad thing.

What do we do? Well, we can use that "something" string literal in both test methods:

class FooTest {
  @Test
  public void simplyWorks() throws IOException {
    assertThat(
      new Foo("something").doSomething(),
      containsString("something")
    );
  }
  @Test
  public void simplyWorksAgain() throws IOException {
    assertThat(
      new Foo("something").doSomethingElse(),
      containsString("something")
    );
  }
}

As you see, I got rid of that setUp() method and the private static literal MSG. What do we have now? Code duplication. String "something" shows up four times in the test class. No static analyzers will tolerate that. Moreover, there are seven (!) test methods in VerboseListTest that use MSG. Thus, we will have 14 occurrences of "something", right? Yes, that's right, and that's most likely why one of the authors of this test case introduced the constant---to get rid of duplication. BTW, @Happy-Neko did that in pull request #513, @carlosmiranda reviewed the code and I approved the changes. So, three people made/approved that mistake, including myself.

So what is the right approach that will avoid code duplication and at the same time won't introduce coupling? Here it is:

class FooTest {
  @Test
  public void simplyWorks() throws IOException {
    final String msg = "something";
    assertThat(
      new Foo(msg).doSomething(),
      containsString(msg)
    );
  }
  @Test
  public void simplyWorksAgain() throws IOException {
    final String msg = "something else";
    assertThat(
      new Foo(msg).doSomethingElse(),
      containsString(msg)
    );
  }
}

These literals must be different. This is what any static analyzer is saying when it sees "something" in so many places. It questions us---why are they the same? Is it really so important to use "something" everywhere? Why can't you use different literals? Of course we can. And we should.

The bottom line is that each test method must have its own set of data and objects. They must not be shared between test methods ever. Test methods must always be independent, having nothing in common.

Having that in mind, we can easily conclude that methods like setUp() or any shared variables in test classes are evil. They must not be used and simply must not exist. I think that their invention in JUnit caused a lot of harm to Java code.


Why InputStream Design Is Wrong


  • Washington, D.C.

It's not just about InputStream; this class is simply a good example of bad design. I'm talking about the three overloaded read() methods. I've mentioned this problem in Section 2.9 of Elegant Objects. In a few words, I strongly believe that interfaces must be "functionality poor." InputStream should have been an interface in the first place, and it should have had a single method: read(byte[]). Then, if its authors wanted to give us extra functionality, they should have created supplementary "smart" classes.

A Serious Man (2009) by Coen Brothers

This is how it looks now:

abstract class InputStream {
  int read();
  int read(byte[] buffer, int offset, int length);
  int read(byte[] buffer);
}

What's wrong? It's very convenient to have the ability to read a single byte, an array of bytes or even an array of bytes with a direct positioning into a specific place in the buffer!

However, we are still lacking a few methods: for reading the bytes and immediately saving them to a file, converting them to text with a selected encoding, sending them by email, and posting them on Twitter. It would be great to have these features too, right in poor InputStream. I hope the Oracle Java team is working on them now.

In the meantime, let's see what exactly is wrong with what these bright engineers have already designed for us. Or better, let me show how I would design InputStream and we'll compare:

interface InputStream {
  int read(byte[] buffer, int offset, int length);
}

This is my design. The InputStream is responsible for reading bytes from the stream. There is one single method for this feature. Is it convenient for everybody? Does it read and post on Twitter? Not yet. Do we need that functionality? Of course we do, but it doesn't mean that we will add it to the interface. Instead, we will create supplementary "smart" class:

interface InputStream {
  int read(byte[] buffer, int offset, int length);
  class Smart {
    private final InputStream origin;
    public Smart(InputStream stream) {
      this.origin = stream;
    }
    public int read() {
      final byte[] buffer = new byte[1];
      final int read = this.origin.read(buffer, 0, 1);
      final int result;
      if (read < 1) {
        result = -1;
      } else {
        result = buffer[0] & 0xff; // treat the byte as unsigned (0..255)
      }
      return result;
    }
  }
}

Now, we want to read a single byte from the stream. Here is how:

final InputStream input = new FileInputStream("/tmp/a.txt");
final int b = new InputStream.Smart(input).read();

The functionality of reading a single byte is outside of InputStream, because this is not its business. The stream doesn't need to know how to manage the data after it is read. All the stream is responsible for is reading, not parsing or manipulating afterwards.
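
The same pattern scales to any extra feature. Below is my own runnable sketch of another "smart" method that drains the whole stream; I renamed the interface to Input only to avoid clashing with java.io.InputStream, and everything else (names, the 1 KB buffer) is my assumption:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// One "poor" interface plus a supplementary smart class,
// in the spirit of the design described above.
interface Input {
  int read(byte[] buffer, int offset, int length) throws IOException;

  final class Smart {
    private final Input origin;
    Smart(final Input input) {
      this.origin = input;
    }
    // Drains the whole stream into a byte array, 1 KB at a time.
    byte[] readAll() throws IOException {
      final ByteArrayOutputStream out = new ByteArrayOutputStream();
      final byte[] buf = new byte[1024];
      while (true) {
        final int len = this.origin.read(buf, 0, buf.length);
        if (len < 0) {
          break;
        }
        out.write(buf, 0, len);
      }
      return out.toByteArray();
    }
  }
}

public class InputDemo {
  public static void main(final String[] args) throws IOException {
    final ByteArrayInputStream source =
      new ByteArrayInputStream("hello".getBytes());
    // InputStream.read(byte[], int, int) matches the Input interface,
    // so a method reference is enough to adapt it.
    final byte[] all = new Input.Smart(source::read).readAll();
    System.out.println(new String(all)); // prints "hello"
  }
}
```

Note that Input itself stays a single-method interface; every convenience lives in Smart.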

Interfaces must be small.

Obviously, method overloading in interfaces is a code smell. An interface with more than three methods is a good candidate for refactoring. If methods overload each other---it's serious trouble.

Interfaces must be small!

You may say that the creators of InputStream cared about performance, and that's why they allowed us to implement read() in three different forms. Then I have to ask again: why not create a method for reading and immediately posting to Twitter? That would be fantastically fast. Isn't that what we all want? Fast software that nobody has any desire to read or maintain.


Object Behavior Must Not Be Configurable


  • New York, NY

Using object properties as configuration parameters is a very common mistake we keep making mostly because our objects are mutable---we configure them. We change their behavior by injecting parameters or even entire settings/configuration objects into them. Do I have to say that it's abusive and disrespectful from a philosophical point of view? I can, but let's take a look at it from a practical perspective.

The Take (2009) by David Drury

Let's say there is a class that is supposed to read a web page and return its content:

class Page {
  private final String uri;
  Page(final String address) {
    this.uri = address;
  }
  public String html() throws IOException {
    return IOUtils.toString(
      new URL(this.uri).openStream(),
      "UTF-8"
    );
  }
}

Looks simple and straightforward, right? Yes, it's a rather cohesive and solid class. Here is how we use it to read the content of the Google front page:

String html = new Page("http://www.google.com").html();

Everything is fine until we start making this class more powerful. Let's say we want to configure the encoding. We don't always want to use "UTF-8". We want it to be configurable. Here is what we do:

class Page {
  private final String uri;
  private final String encoding;
  Page(final String address, final String enc) {
    this.uri = address;
    this.encoding = enc;
  }
  public String html() throws IOException {
    return IOUtils.toString(
      new URL(this.uri).openStream(),
      this.encoding
    );
  }
}

Done, the encoding is encapsulated and configurable. Now, let's say we want to change the behavior of the class for the situation of an empty page. If an empty page is loaded, we want to return "<html/>". But not always. We want this to be configurable. Here is what we do:

class Page {
  private final String uri;
  private final String encoding;
  private final boolean alwaysHtml;
  Page(final String address, final String enc,
    final boolean always) {
    this.uri = address;
    this.encoding = enc;
    this.alwaysHtml = always;
  }
  public String html() throws IOException {
    String html = IOUtils.toString(
      new URL(this.uri).openStream(),
      this.encoding
    );
    if (html.isEmpty() && this.alwaysHtml) {
      html = "<html/>";
    }
    return html;
  }
}

The class is getting bigger, huh? It's great, we're good programmers and our code must be complex, right? The more complex it is, the better programmers we are! I'm being sarcastic. Definitely not! But let's move on. Now we want our class to proceed anyway, even if the encoding is not supported on the current platform:

class Page {
  private final String uri;
  private final String encoding;
  private final boolean alwaysHtml;
  private final boolean encodeAnyway;
  Page(final String address, final String enc,
    final boolean always, final boolean encode) {
    this.uri = address;
    this.encoding = enc;
    this.alwaysHtml = always;
    this.encodeAnyway = encode;
  }
  public String html() throws IOException,
  UnsupportedEncodingException {
    final byte[] bytes = IOUtils.toByteArray(
      new URL(this.uri).openStream()
    );
    String html;
    try {
      html = new String(bytes, this.encoding);
    } catch (UnsupportedEncodingException ex) {
      if (!this.encodeAnyway) {
        throw ex;
      }
      html = new String(bytes, "UTF-8");
    }
    if (html.isEmpty() && this.alwaysHtml) {
      html = "<html/>";
    }
    return html;
  }
}

The class is growing and becoming more and more powerful! Now it's time to introduce a new class, which we will call PageSettings:

class Page {
  private final String uri;
  private final PageSettings settings;
  Page(final String address, final PageSettings stts) {
    this.uri = address;
    this.settings = stts;
  }
  public String html() throws IOException {
    final byte[] bytes = IOUtils.toByteArray(
      new URL(this.uri).openStream()
    );
    String html;
    try {
      html = new String(bytes, this.settings.getEncoding());
    } catch (UnsupportedEncodingException ex) {
      if (!this.settings.isEncodeAnyway()) {
        throw ex;
      }
      html = new String(bytes, "UTF-8");
    }
    if (html.isEmpty() && this.settings.isAlwaysHtml()) {
      html = "<html/>";
    }
    return html;
  }
}

Class PageSettings is basically a holder of parameters, without any behavior. It has getters, which give us access to the parameters: isEncodeAnyway(), isAlwaysHtml(), and getEncoding(). If we keep going in this direction, there could be a few dozen configuration settings in that class. This may look very convenient, and it is a very typical pattern in the Java world. For example, look at JobConf from Hadoop. This is how we will call our highly configurable Page (I'm assuming PageSettings is immutable):

String html = new Page(
  "http://www.google.com",
  new PageSettings()
    .withEncoding("ISO_8859_1")
    .withAlwaysHtml(true)
    .withEncodeAnyway(false)
).html();

However, no matter how convenient it may look at first glance, this approach is very wrong. Mostly because it encourages us to make big and non-cohesive objects. They grow in size and become less testable, less maintainable and less readable.

To prevent that from happening, I would suggest a simple rule here: object behavior should not be configurable. Or, more technically, encapsulated properties must not be used to change the behavior of an object.

Object properties are there only to serve as coordinates of the real-world entity the object represents. The uri is a coordinate, while the alwaysHtml boolean property is a behavior-changing trigger. See the difference?

So, what should we do instead? What is the right design? We must use composable decorators. Here is how:

Page page = new NeverEmptyPage(
  new DefaultPage("http://www.google.com")
);
String html = new AlwaysTextPage(
  new TextPage(page, "ISO_8859_1"),
  page
).html();

Here is how our DefaultPage would look (yes, I had to change its design a bit):

class DefaultPage implements Page {
  private final String uri;
  DefaultPage(final String address) {
    this.uri = address;
  }
  @Override
  public byte[] html() throws IOException {
    return IOUtils.toByteArray(
      new URL(this.uri).openStream()
    );
  }
}
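
The Page interface itself is never shown in the article; judging from DefaultPage, it presumably looks like this (my reconstruction, not the author's code):

```java
import java.io.IOException;

// Presumed interface: every page knows how to render itself as raw bytes.
interface Page {
  byte[] html() throws IOException;
}
```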

As you see, I'm making it implement the interface Page. Now the TextPage decorator, which converts an array of bytes to text using the provided encoding:

class TextPage {
  private final Page origin;
  private final String encoding;
  TextPage(final Page page, final String enc) {
    this.origin = page;
    this.encoding = enc;
  }
  public String html() throws IOException {
    return new String(
      this.origin.html(),
      this.encoding
    );
  }
}

Now the NeverEmptyPage:

class NeverEmptyPage implements Page {
  private final Page origin;
  NeverEmptyPage(final Page page) {
    this.origin = page;
  }
  @Override
  public byte[] html() throws IOException {
    byte[] bytes = this.origin.html();
    if (bytes.length == 0) {
      bytes = "<html/>".getBytes();
    }
    return bytes;
  }
}

And finally the AlwaysTextPage:

class AlwaysTextPage {
  private final TextPage origin;
  private final Page source;
  AlwaysTextPage(final TextPage page, final Page src) {
    this.origin = page;
    this.source = src;
  }
  public String html() throws IOException {
    String html;
    try {
      html = this.origin.html();
    } catch (UnsupportedEncodingException ex) {
      html = new TextPage(this.source, "UTF-8").html();
    }
    return html;
  }
}
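
To check that the fallback actually works, here is a compact, self-contained sketch of the two decorators with a stub page; the class name FallbackDemo and the stub lambda are mine, and the bogus encoding name simply forces the UTF-8 branch:

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;

public class FallbackDemo {
  interface Page {
    byte[] html() throws IOException;
  }
  // Same TextPage as above: decodes the bytes with the given encoding.
  static class TextPage {
    private final Page origin;
    private final String encoding;
    TextPage(final Page page, final String enc) {
      this.origin = page;
      this.encoding = enc;
    }
    String html() throws IOException {
      return new String(this.origin.html(), this.encoding);
    }
  }
  // Same AlwaysTextPage as above: falls back to UTF-8 on failure.
  static class AlwaysTextPage {
    private final TextPage origin;
    private final Page source;
    AlwaysTextPage(final TextPage page, final Page src) {
      this.origin = page;
      this.source = src;
    }
    String html() throws IOException {
      try {
        return this.origin.html();
      } catch (final UnsupportedEncodingException ex) {
        return new TextPage(this.source, "UTF-8").html();
      }
    }
  }
  public static void main(final String[] args) throws IOException {
    final Page page = () -> "<html/>".getBytes();
    // "NO-SUCH-ENCODING" triggers the UTF-8 fallback path.
    System.out.println(
      new AlwaysTextPage(
        new TextPage(page, "NO-SUCH-ENCODING"), page
      ).html()
    );
  }
}
```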

You may say that AlwaysTextPage will make two calls to the encapsulated origin in the case of an unsupported encoding, which will lead to a duplicated HTTP request. That's true, and we don't want this duplicated HTTP roundtrip to happen. Let's introduce one more class, which will cache the page once it is fetched (it's not thread-safe, but that's not important now):

class OncePage implements Page {
  private final Page origin;
  private final AtomicReference<byte[]> cache =
    new AtomicReference<>();
  OncePage(final Page page) {
    this.origin = page;
  }
  @Override
  public byte[] html() throws IOException {
    if (this.cache.get() == null) {
      this.cache.set(this.origin.html());
    }
    return this.cache.get();
  }
}
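
To see that OncePage really prevents the second fetch, here is a quick runnable sketch with a stub page that counts calls (FakePage and OncePageDemo are my own names, not from the article):

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class OncePageDemo {
  interface Page {
    byte[] html() throws IOException;
  }
  // A stub that counts how many times it is fetched.
  static class FakePage implements Page {
    int fetches;
    @Override
    public byte[] html() {
      ++this.fetches;
      return "<html/>".getBytes();
    }
  }
  // Same OncePage as above: fetches the origin once and caches the result.
  static class OncePage implements Page {
    private final Page origin;
    private final AtomicReference<byte[]> cache = new AtomicReference<>();
    OncePage(final Page page) {
      this.origin = page;
    }
    @Override
    public byte[] html() throws IOException {
      if (this.cache.get() == null) {
        this.cache.set(this.origin.html());
      }
      return this.cache.get();
    }
  }
  public static void main(final String[] args) throws IOException {
    final FakePage fake = new FakePage();
    final Page page = new OncePage(fake);
    page.html();
    page.html();
    System.out.println(fake.fetches);
  }
}
```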

Now, our code should look like this (pay attention, I'm now using OncePage):

Page page = new NeverEmptyPage(
  new OncePage(
    new DefaultPage("http://www.google.com")
  )
);
String html = new AlwaysTextPage(
  new TextPage(page, "ISO_8859_1"),
  page
).html();

This is probably the most code-intensive post on this site so far, but I hope it's readable and I managed to convey the idea. Now we have five classes, each of which is rather small, easy to read and easy to reuse.

Just follow the rule: never make classes configurable!

© Yegor Bugayenko 2014–2018

Java Annotations Are a Big Mistake


  • Seattle, WA

Annotations were introduced in Java 5, and we all got excited. Such a great instrument to make code shorter! No more Hibernate/Spring XML configuration files! Just annotations, right there in the code where we need them. No more marker interfaces, just a runtime-retained reflection-discoverable annotation! I was excited too. Moreover, I've made a few open source libraries which use annotations heavily. Take jcabi-aspects, for example. However, I'm not excited any more. Moreover, I believe that annotations are a big mistake in Java design.

Gomorra (2008) by Matteo Garrone

Long story short, there is one big problem with annotations---they encourage us to implement object functionality outside of an object, which is against the very principle of encapsulation. The object is not solid any more, since its behavior is not defined entirely by its own methods---some of its functionality stays elsewhere. Why is it bad? Let's see in a few examples.

@Inject

Say we annotate a property with @Inject:

import javax.inject.Inject;
public class Books {
  @Inject
  private DB db;
  // some methods here, which use this.db
}

Then we have an injector that knows what to inject:

Injector injector = Guice.createInjector(
  new AbstractModule() {
    @Override
    public void configure() {
      this.bind(DB.class).toInstance(
        new Postgres("jdbc:postgresql:5740/main")
      );
    }
  }
);

Now we're making an instance of class Books via the container:

Books books = injector.getInstance(Books.class);

The class Books has no idea how and by whom an instance of class DB will be injected into it. This will happen behind the scenes and outside of its control. The injector will do it. It may look convenient, but this attitude causes a lot of damage to the entire code base. The control is lost (not inverted, but lost!). The object is not in charge any more. It can't be responsible for what's happening to it.

Instead, here is how this should be done:

class Books {
  private final DB db;
  Books(final DB base) {
    this.db = base;
  }
  // some methods here, which use this.db
}
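
With plain constructors, the wiring that Guice was doing becomes one visible `new` expression. A tiny self-contained sketch (the DB/Postgres stubs here are simplified stand-ins for the ones above):

```java
// Sketch of manual composition: the object graph is assembled in plain sight.
public class Wiring {
  interface DB {
    String url();
  }
  static class Postgres implements DB {
    private final String jdbc;
    Postgres(final String jdbc) {
      this.jdbc = jdbc;
    }
    @Override
    public String url() {
      return this.jdbc;
    }
  }
  static class Books {
    private final DB db;
    Books(final DB base) {
      this.db = base;
    }
    String storage() {
      return this.db.url();
    }
  }
  public static void main(final String[] args) {
    // Composition happens right here, with operator new, not in a container.
    final Books books = new Books(
      new Postgres("jdbc:postgresql:5740/main")
    );
    System.out.println(books.storage());
  }
}
```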

This article explains why Dependency Injection containers are a wrong idea in the first place: Dependency Injection Containers are Code Polluters. Annotations basically provoke us to make the containers and use them. We move functionality outside of our objects and put it into containers, or somewhere else. That's because we don't want to duplicate the same code over and over again, right? That's correct, duplication is bad, but tearing an object apart is even worse. Way worse. The same is true about ORM (JPA/Hibernate), where annotations are being actively used. Check this post, it explains what is wrong with ORM: ORM Is an Offensive Anti-Pattern. Annotations by themselves are not the key motivator, but they help and encourage us to tear objects apart and keep the parts in different places: containers, sessions, managers, controllers, etc.

@XmlElement

This is how JAXB works, when you want to convert your POJO to XML. First, you attach the @XmlElement annotation to the getter:

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement
public class Book {
  private final String title;
  public Book(final String title) {
    this.title = title;
  }
  @XmlElement
  public String getTitle() {
    return this.title;
  }
}

Then, you create a marshaller and ask it to convert an instance of class Book into XML:

final Book book = new Book("Clean Code");
final JAXBContext ctx = JAXBContext.newInstance(Book.class);
final Marshaller marshaller = ctx.createMarshaller();
marshaller.marshal(book, System.out);

Who is creating the XML? Not the book. Someone else, outside of the class Book. This is very wrong. Instead, this is how this should have been done. First, the class that has no idea about XML:

class DefaultBook implements Book {
  private final String title;
  DefaultBook(final String title) {
    this.title = title;
  }
  @Override
  public String getTitle() {
    return this.title;
  }
}

Then, the decorator that prints it to the XML:

class XmlBook implements Book {
  private final Book origin;
  XmlBook(final Book book) {
    this.origin = book;
  }
  @Override
  public String getTitle() {
    return this.origin.getTitle();
  }
  public String toXML() {
    return String.format(
      "<book><title>%s</title></book>",
      this.getTitle()
    );
  }
}

Now, in order to print the book in XML we do the following:

String xml = new XmlBook(
  new DefaultBook("Elegant Objects")
).toXML();

The XML printing functionality is inside XmlBook. If you don't like the decorator idea, you can move the toXML() method to the DefaultBook class. It's not important. What is important is that the functionality always stays where it belongs---inside the object. Only the object knows how to print itself to the XML. Nobody else!

@RetryOnFailure

Here is an example (from my own library):

import com.jcabi.aspects.RetryOnFailure;
class Foo {
  @RetryOnFailure
  public String load(URL url) {
    return url.openConnection().getContent();
  }
}

After compilation, we run a so called AOP weaver that technically turns our code into something like this:

class Foo {
  public String load(URL url) {
    while (true) {
      try {
        return _Foo.load(url);
      } catch (Exception ex) {
        // ignore it
      }
    }
  }
  class _Foo {
    public String load(URL url) {
      return url.openConnection().getContent();
    }
  }
}

I simplified the actual algorithm of retrying a method call on failure, but I'm sure you get the idea. AspectJ, the AOP engine, uses @RetryOnFailure annotation as a signal, informing us that the class has to be wrapped into another one. This is happening behind the scenes. We don't see that supplementary class, which implements the retrying algorithm. But the bytecode produced by the AspectJ weaver contains a modified version of class Foo.

That is exactly what is wrong with this approach---we don't see and don't control the instantiation of that supplementary object. Object composition, which is the most important process in object design, is hidden somewhere behind the scenes. You may say that we don't need to see it since it's supplementary. I disagree. We must see how our objects are composed. We may not care about how they work, but we must see the entire composition process.

A much better design would look like this (instead of annotations):

Foo foo = new FooThatRetries(new Foo());

And then, the implementation of FooThatRetries:

class FooThatRetries implements Foo {
  private final Foo origin;
  FooThatRetries(Foo foo) {
    this.origin = foo;
  }
  public String load(final URL url) {
    return new Retry().eval(
      new Retry.Algorithm<String>() {
        @Override
        public String eval() {
          return FooThatRetries.this.origin.load(url);
        }
      }
    );
  }
}

And now, the implementation of Retry:

class Retry {
  public <T> T eval(Retry.Algorithm<T> algo) {
    while (true) {
      try {
        return algo.eval();
      } catch (Exception ex) {
        // ignore it
      }
    }
  }
  interface Algorithm<T> {
    T eval();
  }
}
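
To sanity-check Retry, here is a runnable sketch with an algorithm that fails twice before succeeding (the flaky stub and the RetryDemo wrapper are mine, not from the article):

```java
// Sketch: Retry keeps calling the algorithm until it stops throwing.
public class RetryDemo {
  interface Algorithm<T> {
    T eval();
  }
  static class Retry {
    <T> T eval(final Algorithm<T> algo) {
      while (true) {
        try {
          return algo.eval();
        } catch (final Exception ex) {
          // ignore it and try again
        }
      }
    }
  }
  public static void main(final String[] args) {
    final int[] attempts = {0};
    final String result = new Retry().eval(
      () -> {
        // Fail on the first two attempts, succeed on the third.
        if (++attempts[0] < 3) {
          throw new IllegalStateException("flaky failure");
        }
        return "loaded";
      }
    );
    System.out.println(result + " after " + attempts[0] + " attempts");
  }
}
```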

Is the code longer? Yes. Is it cleaner? A lot more. I regret that I didn't understand it two years ago, when I started to work with jcabi-aspects.


The bottom line is that annotations are bad. Don't use them. What should be used instead? Object composition.

What could be worse than annotations? Configurations. For example, XML configurations. The Spring XML configuration mechanism is a perfect example of terrible design. I've said it many times before. Let me repeat it again---Spring Framework is one of the worst software products in the Java world. If you can stay away from it, you will do yourself a big favor.

There should not be any "configurations" in OOP. We can't configure our objects if they are real objects. We can only instantiate them. And the best method of instantiation is operator new. This operator is the key instrument for an OOP developer. Taking it away from us and giving us "configuration mechanisms" is an unforgivable crime.


Growing Revenue May Kill Your Startup


  • Palo Alto, CA

Revenue means cash that is coming into your bank account every month from your customers. Not investors. Customers, those who are buying your products or services. You are doing everything you can to make sure this number grows, mostly because you use this money to pay your rent, buy food, and settle that graphic designer's invoices. Without revenue, your startup will die, right? Yes, maybe. But in my experience, growing revenue may kill it even faster.

Blow (2001) by Ted Demme

I see this rather typical pattern in many startups we interview at SeedRamp. The idea is great, the prototype works, the first customers are on board, and the first payments are coming in. The founders are excited. They spend all their energy to make sure those first paying customers are happy by creating new features, fixing bugs, and employing new CRMs. They also try to acquire new buyers and pay for marketing, promotion, and Google AdWords. The numbers grow every month and ... they don't realize they are actually killing their own startup.


If you're building a cafe, a bakery, a web development studio, or a bicycle repair shop, a growing customer base with stable revenue must be your main objective. Because your main source of income is the profit you will get from the business, in the form of dividends.

How rich you will become depends on many factors, but usually such a lifestyle business takes years to really take off and start making millions of dollars, which you're obviously looking for.

In that type of business, almost everything depends on your energy, your management skills, and your ability to work non-stop and motivate others to do the same. If you have all that in place, you most certainly will get what you deserve sooner or later. In most cases, later. But you will get it.

The concept of a startup company is completely different. A startup is a wild bet you're making on some crazy idea that makes "the world a better place." You're building a new Facebook, a Google killer, or a Snapchat replacement. Your goal is huge, while the investment is very small. Just a few years of work and you'll score hundreds of millions. This money will come not from happy customers and steadily growing revenue. Not at all. You will become filthy rich only when someone buys your startup.

These two strategies contradict each other: a traditional business vs. a startup.

Customers and revenue are not the goal of a startup, but rather an instrument that helps you achieve the real goal: valuation. You are supposed to use your revenue to convince investors that the prototype works and your valuation is already high enough. Your steadily growing cash flow right now must be used as a demonstration of a future customer acquisition model. But it is not the result by itself. It is just a tool in your hands.

Your valuation is what makes you rich, not your revenue or your happy customers. Of course, the revenue is important, but only as long as it serves the main goal---increase valuation at an extremely fast pace. The revenue is not the goal; it's the way to achieve it. The valuation is the goal.

Startup valuation must grow fast, ideally doubling every few months. If that's not happening, close the business and start something else. The valuation must skyrocket, or you have to abandon the startup and try your hand at another one as soon as possible.

One of the biggest mistakes a startup founder can make is to forget about this "skyrocketing valuation" principle and focus on making customers happy and growing revenue. You will most likely kill your startup, or maybe turn it into a lifestyle business.

Savvy investors will avoid you, mostly because they understand that growing revenue is just one of many elements of a growing valuation. If you're focused on just one element, you most likely won't multiply your valuation a hundred times over the next year. Maybe you will multiply revenue a few times, but who cares? The revenue is good for you, but not really good for investors.

What is good for investors? What do they want to see you doing to convince them that you're working hard on making the valuation grow? I'll try to cover that in one of the next articles.


Printers Instead of Getters


  • Palo Alto, CA

Getters and setters are evil. No need to argue about this, it's settled. You disagree? Let's discuss that later. For now, let's say, we want to get rid of getters. The key question is how is it possible at all? We do need to get the data out of an object, right? Nope. Wrong.

Le fabuleux destin d'Amélie Poulain (2001) by Jean-Pierre Jeunet

I'm suggesting we use "printers" instead. Rather than exposing data via getters, an object will have the functionality of printing itself to some media.

Let's say this is our class:

public class Book {
  private final String isbn =
    "0735619654";
  private final String title =
    "Object Thinking";
}

We need it to be transferred into XML format. A more or less traditional way to do it is via getters and JAXB:

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement
public class Book {
  private final String isbn =
    "0735619654";
  private final String title =
    "Object Thinking";
  @XmlElement
  public String getIsbn() {
    return this.isbn;
  }
  @XmlElement
  public String getTitle() {
    return this.title;
  }
}

This is a very offensive way of treating the object. We're basically exposing everything that's inside to the public. It was a nice little self-sufficient solid object, and we turned it into a bag of data, which anyone can access in many possible ways. It's read access only, of course, but it's exposure nonetheless.

It is convenient to have these getters, you may say. We are all used to them. If we want to convert it into JSON, they will be very helpful. If we want to use this poor object as a data object in JSP, getters will help us. There are many examples in Java, where getters are being actively used.

This is not because they are so effective. This is because we're so procedural in our way of thinking. We don't trust our objects. We only trust the data they store. We don't want this Book object to generate the XML. We want it to give us the data. We will build the XML. The Book is too stupid to do that job. We're way smarter!

I'm suggesting to stop thinking this way. Instead, let's try to give this poor Book a chance, and equip it with a "printer":

public class Book {
  private final String isbn =
    "0735619654";
  private final String title =
    "Object Thinking";
  public String toXML() {
    return String.format(
      "<book><isbn>%s</isbn><title>%s</title></book>",
      this.isbn, this.title
    );
  }
}

This isn't the best implementation, but you get the idea. The object is not exposing its internals any more. We can't get its ISBN and its title. We can only ask it to print itself in XML format.
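
For the record, here is the same class wrapped into a runnable sketch, so you can see exactly what the printer produces (the BookDemo wrapper is mine):

```java
// Sketch: the object prints itself; nobody reaches inside for the data.
public class BookDemo {
  static class Book {
    private final String isbn = "0735619654";
    private final String title = "Object Thinking";
    String toXML() {
      return String.format(
        "<book><isbn>%s</isbn><title>%s</title></book>",
        this.isbn, this.title
      );
    }
  }
  public static void main(final String[] args) {
    System.out.println(new Book().toXML());
  }
}
```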

We can add an additional printer, if another format is required:

public class Book {
  private final String isbn =
    "0735619654";
  private final String title =
    "Object Thinking";
  public String toJSON() {
    return String.format(
      "{\"isbn\":\"%s\", \"title\":\"%s\"}",
      this.isbn, this.title
    );
  }
}

Again, not the best implementation, but you see what I'm trying to show. Each time we need a new format, we create a new printer.

You may say that the object will be rather big if there are many formats. That's true, but a big object is a bad design in the first place. I would say that if there is more than one printer---it's a problem.

So, what should we do if we need multiple formats? Use "media" that the printers will print to. Say, we have an object that represents a record in MySQL. We want it to be printable to XML, HTML, JSON, some binary format, and God knows what else. We could add that many printers to it, but the object would become big and ugly. To avoid that, we introduce a new object that represents the media the data will be printed to:

public class Book {
  private final String isbn =
    "0735619654";
  private final String title =
    "Object Thinking";
  public Media print(Media media) {
    return media
      .with("isbn", this.isbn)
      .with("title", this.title);
  }
}
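
The Media interface is implied but never shown; presumably it is as small as this (my reconstruction):

```java
// Presumed Media interface: accepts name/value pairs, immutably,
// returning a new Media that includes the added pair.
interface Media {
  Media with(String name, String value);
}
```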

Again, it's a very primitive design of that immutable Media class, but you get the idea---the media accepts the data. Now, we want to print our object to JSON (this design is not really perfect, since JsonObjectBuilder is not immutable, even though it looks like one...):

class JsonMedia implements Media {
  private final JsonObjectBuilder builder;
  JsonMedia() {
    this(Json.createObjectBuilder());
  }
  JsonMedia(JsonObjectBuilder bdr) {
    this.builder = bdr;
  }
  @Override
  public Media with(String name, String value) {
    return new JsonMedia(
      this.builder.add(name, value)
    );
  }
  public JsonObject json() {
    return this.builder.build();
  }
}

Now, we make an instance of JsonMedia and ask our book to print itself there:

JsonMedia media = new JsonMedia();
book.print(media);
JsonObject json = media.json();

Voilà! The JSON object is ready, and the book has no idea what exactly was printed just now. We need to print the book to XML? We create XmlMedia, which will print the book to XML. The Book class stays small, while the complexity of "media" objects is unlimited.
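
Such an XmlMedia is not shown in the article; one possible sketch, kept deliberately string-based and immutable (all names here are my own guesses, not the author's code):

```java
// Sketch of a hypothetical XmlMedia: each with() returns a new instance,
// accumulating name/value pairs as nested XML elements.
public class XmlMediaDemo {
  interface Media {
    Media with(String name, String value);
  }
  static class XmlMedia implements Media {
    private final String root;
    private final String body;
    XmlMedia(final String root) {
      this(root, "");
    }
    private XmlMedia(final String root, final String body) {
      this.root = root;
      this.body = body;
    }
    @Override
    public Media with(final String name, final String value) {
      return new XmlMedia(
        this.root,
        String.format("%s<%s>%s</%s>", this.body, name, value, name)
      );
    }
    public String xml() {
      return String.format("<%s>%s</%s>", this.root, this.body, this.root);
    }
  }
  public static void main(final String[] args) {
    final XmlMedia media = (XmlMedia) new XmlMedia("book")
      .with("isbn", "0735619654")
      .with("title", "Object Thinking");
    System.out.println(media.xml());
  }
}
```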

My point here is simple---no getters, just printers!


Jare.io, an Instant and Free CDN


  • Palo Alto, CA

CDN stands for a Content Delivery Network. Technically, it is a bunch of servers located in different countries and continents. You give them your logo.gif and they give you a URL, which resolves to a different server depending on who is trying to resolve it. As a result, the file is always close to the end-user and your website loads much faster than without a CDN. Sounds good, but all CDN providers want money for their service and usually require a rather complex setup and registration procedure. My pet project jare.io is a free CDN that is simple to configure. It utilizes AWS CloudFront resources.

First, let me show how it works and then, if you're interested in the details, I will explain how it's done internally. Say you have this HTML:

<img src="//www.teamed.io/images/logo.svg"/>

I want this logo.svg to be delivered via a CDN. There are two steps. First, I register my domain at jare.io:

The figure

Second, I change my HTML:

<img src="//cf.jare.io/?u=http://www.teamed.io/images/logo.svg"/>

That's it.

Try it with your own resources and you will see how much faster they will be loaded.

It's absolutely free, but I ask you to be reasonable. If your traffic is huge, you need your own account in CloudFront or somewhere else. My service is for small projects.

Now for more technical details, if you want to know how this solution works internally. First, let's discuss what a CDN is and how it works.

URL, DNS, TCP, HTTP

When your browser wants to load an image, it has a URL for that, like in the example above. This is the URL: http://www.teamed.io/images/logo.svg. There are three important parts in this address. First is http, the protocol. Second is www.teamed.io, the host name. Third is the tail /images/logo.svg, the path. To load the image, the browser has to open a socket, connecting your computer and the server which has the image. To open a socket, the browser needs to know the IP address of the server.

There is no such address in that URL. In order to find the IP address, the browser is doing what is called a lookup. It connects to the nearest name server and asks "what is the IP address of www.teamed.io?" The answer usually contains a single IP address:

$ nslookup www.teamed.io
Server:   172.16.0.1
Address:  172.16.0.1#53

Non-authoritative answer:
www.teamed.io canonical name = teamed.github.io.
teamed.github.io  canonical name = github.map.fastly.net.
Name: github.map.fastly.net
Address: 199.27.79.133

The IP address of www.teamed.io is 199.27.79.133, at the time of writing.

When the address is known, the browser opens a new socket and sends an HTTP request through it:

GET /images/logo.svg HTTP/1.1
Host: www.teamed.io
Accept: image/*

The server responds with an HTTP response:

HTTP/1.1 200 OK
Content-Type: image/svg+xml

[SVG image content goes here, over 1000 bytes]

That is the SVG image we're looking for. The browser renders it on the web page and that's it.

The Network of Edge Servers

So far so good, but if the distance between your browser and that IP address is rather large, loading the image will take a lot of time. Well, hundreds of milliseconds. Try to load this image, which is located on a server that is hosted in Prague, Czech Republic (I'm using curl as suggested here):

$ curl -w "@f.txt" -o /dev/null -s \
  http://www.vlada.cz/images/vlada/vlada-ceske-republiky_en.gif
    time_namelookup:  0.005
       time_connect:  0.376
   time_pretransfer:  0.377
 time_starttransfer:  0.566
                    ----------
         time_total:  0.567

I'm trying to do it from Palo Alto, California, which is about half a globe away from Prague. As you can see, it takes over 500ms. That's too much, especially if a web page contains many images. Overall, page loading may take seconds, just because the server is too far away from me. Well, it will inevitably be too far away from some users, no matter where we host it. If we host it here in California, it will be close enough to me and the image will be loaded instantly (less than 50ms). But then it will be too slow for users in Prague.

This problem has no solutions if the server generates images or pages on the fly in some unique way and if we can't install a number of servers in different countries and continents. But in most cases, such as our logo example, this is not a problem. This logo doesn't need to be unique for each user. It is a very static resource, which needs to be created only once and be delivered to everybody, without any changes.

So, how about we install an edge server here in California and let Californian users connect to it? When a request for logo.svg comes to one of these edge servers, it will connect to the central server in Prague and load the file. This will happen only once. After that, the edge server will not request the file from the central server again; it will return it immediately, from its internal cache.

We need to have many edge servers, preferably in all countries where our users may be located. The first request will take longer, but all others will be much faster because they will be served from the closest edge server.

Now, the question is how the browser will know which edge server is the closest, right? We simply trick the domain name resolution process. Depending on who is asking, the DNS will give different answers. Let's take cf.jare.io, for example (it is the name of all edge servers responsible for delivering our content in AWS CloudFront, a CNAME for djk1be5eatcae.cloudfront.net). If I'm looking it up from California, I'm getting the following answer:

$ nslookup cf.jare.io
Server:   192.168.1.1
Address:  192.168.1.1#53

Non-authoritative answer:
cf.jare.io  canonical name = djk1be5eatcae.cloudfront.net.
Name: djk1be5eatcae.cloudfront.net
Address: 54.230.141.211

An edge server with IP address 54.230.141.211 is located in San Francisco. This is rather close to me, less than fifty miles. If I do the same operation from a server in Virginia, I get a different response:

$ nslookup cf.jare.io
Server:   172.16.0.23
Address:  172.16.0.23#53

Non-authoritative answer:
cf.jare.io  canonical name = djk1be5eatcae.cloudfront.net.
Name: djk1be5eatcae.cloudfront.net
Address: 52.85.131.217

An edge server with IP address 52.85.131.217 is located in Washington, which is far away from me, but very close to the server I was making the lookup from.

There are thousands of name servers around the world and all of them have different information about where that edge server cf.jare.io is physically located. Depending on who is asking, the answer will be different.

AWS CloudFront

CloudFront is one of the simplest CDN solutions. All you have to do to start delivering your content through their edge nodes is to create a "distribution" and configure it. A distribution is basically a connector between content origin and edge servers:

PlantUML SVG diagram

One of the edge servers receives an HTTP request. If it already has that logo.svg in its cache, it immediately returns an HTTP response with its content inside. If its cache is empty, the edge server makes an HTTP request to the central server. This server knows about the "distribution" and its configuration. It makes an HTTP connection to the origin server, which is www.teamed.io, and asks it to return logo.svg. When done, the image is returned to the edge server, where it is cached.

This looks rather simple, but it's not free and it's not that quick to configure. You have to create an account with CloudFront, register your credit card there, and get an approval. Then you have to create a distribution and configure it. You should then create that CNAME in your name server. If you're doing it for a single website, it's not a big deal. If you have a dozen websites, it's a time-consuming operation.

Jare.io, a Middle Man

Jare.io is an extra component in that diagram, which makes your life easier:

PlantUML SVG diagram

Jare.io has a "relay," which acts as an origin server for CloudFront. All requests that arrive at cf.jare.io are dispatched to the relay. The relay decides what to do with them. The decision is based on the information from the HTTP request URI. For example, the request from the browser has this URI path:

/?u=http://www.teamed.io/images/logo.svg

Remember, the request is made to cf.jare.io, which is the address of the edge server. This exact URI arrives at relay.jare.io. The URI contains enough information to make a decision about which file has to be returned. The relay makes a new HTTP request to www.teamed.io and retrieves the image.

The beauty of this solution is that it's easy. For small websites, it is a free and quick CDN.

By the way, when we query the same image through jare.io (and CloudFront), it comes back much faster:

$ curl -w "@f.txt" -o /dev/null -s \
  http://cf.jare.io/?u=www.vlada.cz/images/vlada/vlada-ceske-republiky_en.gif
    time_namelookup:  0.005
       time_connect:  0.021
   time_pretransfer:  0.021
 time_starttransfer:  0.041
                    ----------
         time_total:  0.041

Most of the work is done by AWS CloudFront, while jare.io is just a relay that makes its configuration more convenient. Besides, it makes it free, because jare.io is sponsored by Zerocracy. In other words, my company will pay for your usage of CloudFront. I would appreciate it if you kept that in mind and didn't use jare.io for traffic-intensive resources.


Unspoken Secrets of an Elevator Pitch


  • Palo Alto, CA

Your success depends on the quality of your elevator pitch. Basically, your success doesn't depend on much of anything else. The pitch is king. You have to impress the investor in just a few seconds, while the elevator is still moving (that's where you are supposed to hunt for that prey). It's not rocket science; you can learn a few basic principles and become an expert. Here they are, the principles.

Huevos de oro (1993) by Bigas Luna

Look. Rule number 1: You must look right, like Mark Zuckerberg---the same gray T-shirt every day with the same smile. The way you look must demonstrate that you're part of the community. You're one of us! These three things will help: a beard, an electric skateboard, and an Apple Watch. If you're a girl, replace the beard with a yoga mat and you're all set. By looking that way, you're telling investors that your chances to succeed are way higher than anyone who doesn't have an Apple Watch and is a stranger because of that.

Innovate. These are the words you must include in your pitch: innovation, innovative, innovate. It's never too many. Put them (or their synonyms) everywhere; it's all for good. How about this: "Our innovative solution is an environmental breakthrough?" Nobody knows what it means, but everybody understands that you're a big-time innovator. Even if it's yet another calorie calculator. Investors must understand that you're not just making a software product; you innovate.

Complicate. Internet of Things in lieu of a thermometer, augmented reality instead of goggles with a camera, and global health rather than a database of dentists. You get the idea. Don't be specific; it's dangerous. They may ask you what kind of database it is and how it's different than the other 50 databases on the market. But if you say global health, they won't have any idea what you're talking about, which is exactly what you need. Learn those words and use them whenever possible. Here are some of my favorites for you to remember: business model canvas, viral mechanics, growth hacking, data-driven mentality, design thinking, and game-changing disruption.

Smile. You are super excited about this fantastic opportunity! You're extremely motivated to do this project with these absolutely great people! You're passionate about the idea and fully committed to implementing it! You love the team, the partners, the investors, and the customers---they're all very nice people you're happy to work with! You're very enthusiastic about everything! You love everyone! ... But be careful; don't overact or you may end up in a mental hospital.

Compare. Compare yourself to Uber; you will never lose. It was Airbnb a few years ago and Dropbox before that. Today it's Uber. Start your pitch with "It's like Uber but ..." No matter what you put after that, it will work. All investors are dreaming about getting a new Uber in their portfolio. You should help them make this dream come true. There's no need to be too complicated; just say you're "like Uber but ..." and your chances to score Round A will jump higher.

Imitate. Elon Musk! You have to mention this guy; you absolutely must. Try this: "As Elon Musk said in his book, it is very difficult to build a successful startup." You, being aware of wisdom like that, give them the impression that you're related to the Silicon Valley culture, which is what they want to feel. Always try to show that Elon Musk is a role model for you. Having a T-shirt featuring his face will also help. By the way, don't use Steve Jobs; it's not a trend anymore. Moreover, he is dead already. Stick to Musk.

Respect. Always remember that you and your potential investors are not equal. They have money, experience, reputation, and a Tesla. You have nothing. You must demonstrate that you always remember that. Always position yourself a bit lower, play newbie, ask questions, and imitate interest in everything they say. Never compete with them or even try to argue! They are always right, and they are simply awesome; you're just another nobody. Something like "I'm honored to be in the same room with you" is a very good opener.

Co-Found. You absolutely must have a co-founder, or even a few of them. They could be anybody. A roommate will work or just a friend of a friend who recently quit her full-time job. It doesn't matter who they are as long as you can call them co-founders. Put their pictures and names on your website and always say "we" instead of "I." Investors, in general, love to have a few phone numbers they can call when they realize their money is gone and there is nothing left. Also, it's more fun to invest in a group of people---investors look more serious in that case.

Radiate. Never say anything bad about anything or anyone. You literally have to be 100% in positive thinking mode. That competitor who is stealing your customers is "such a nice guy." That cup of coffee made of some dark powder is "very delicious." The programmers from Pakistan you found on Upwork for $10 per hour are "absolutely awesome." The marketing plan you invented yesterday is "simply great." Never admit any mistakes, risks, or threats. Simply ignore them. Stay positive. Investors like that.

Bluff. You have to mention that the market is absolutely huge, at least a hundred billion dollars. They won't be able to check anyway. Actually, nobody can check that, so don't worry. But don't give a round number; that would be a mistake! It should sound more like $73.5 billion. That's very convincing and will give them the confidence you need---the market is huge and you're not just saying it; you actually did some math. How can they not invest in such a huge and promising market?

Scream. Especially if you're in a public place, you must speak as if you're a bit deaf. Be loud. That will prove that you're absolutely confident in what you're saying, today and always. The investor will be sure you are capable of attracting customers and other investors, if for no other reason than because you're not afraid of being that loud in public. You're a 100% extrovert---you must demonstrate it shamelessly.

Beg. You may mistakenly think that this is a balanced transaction---they give money and you give part of your equity. You may expect that both parties are equally interested in this. But don't make that mistake. You're begging, and they are giving. Out of courtesy. That's a win position for you. By having money, they expect to be on top of others. This is what capitalism is about, right? Don't take this away from them. You must be ready to shine their shoes. With a big smile on your face.

Love. The truth is that investors don't want to make money. They want to make love. You're a perfect candidate. They are mostly old, rich, lonely, and miserable. Their success stories are behind them. All they have is money, which they are afraid to spend because they understand that they won't make it again. They are used-to-be successful gamblers, but not anymore. Give them what they want: love. Money is not so important, so long as you love each other. Remember, first they must fall in love with you, then they will send the terms sheet.

Did I miss anything?

Rehearse your pitch in front of the mirror in the bathroom three times every morning. I guarantee that in fewer than two weeks, you will sound like a proper SV startup founder. Well, a co-founder---don't forget that. Good luck; you will need a lot of it!


Try. Finally. If. Not. Null.

  • Palo Alto, CA

There is a very typical mistake in the pre-Java 7 "try/finally" scenario, which I keep seeing in so many code reviews. I just have to write about it. Java 7 introduced a solution, but it doesn't cover all situations. Sometimes we need to deal with non-AutoCloseable resources. Let's open and close them correctly, please.

Lock, Stock and Two Smoking Barrels (1998) by Guy Ritchie

This is how it looks (assuming we are in Java 6):

InputStream input = null;
try {
  input = url.openStream();
  // reads the stream, throws IOException
} catch (IOException ex) {
  throw new RuntimeException(ex);
} finally {
  if (input != null) {
    input.close();
  }
}

I already wrote about null and its evil nature. Here it comes again. If you just follow the rule of "not using NULL anywhere ever," this code would need an immediate refactoring. Its correct version will look like this:

final InputStream input = url.openStream();
try {
  // reads the stream, throws IOException
} catch (IOException ex) {
  throw new RuntimeException(ex);
} finally {
  input.close();
}

There is no null anymore and it's very clean. Isn't it?

There are situations when opening the resource itself throws IOException and we can't put it outside of try/catch. In that case, we have to have two try/catch blocks:

final InputStream input;
try {
  input = url.openStream();
} catch (IOException ex) {
  throw new RuntimeException(ex);
}
try {
  // reads the stream, throws IOException
} catch (IOException ex) {
  throw new RuntimeException(ex);
} finally {
  input.close();
}

But there should be no null, never!
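For the record, the Java 7 solution mentioned above is try-with-resources: the stream is closed automatically for anything AutoCloseable, so both the finally block and the null check disappear. Here is a minimal sketch; it reads from a ByteArrayInputStream so that it runs standalone, but in the scenario above the source would be url.openStream():

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TryWithResources {
  static String read(final InputStream source) {
    // The resource declared in the parentheses is closed
    // automatically, even when an exception is thrown:
    // no finally block and no null check anywhere.
    try (InputStream input = source) {
      final byte[] buf = new byte[1024];
      final int len = input.read(buf); // single read; enough for a sketch
      return len > 0 ? new String(buf, 0, len, "UTF-8") : "";
    } catch (final IOException ex) {
      throw new RuntimeException(ex);
    }
  }

  public static void main(final String[] args) throws Exception {
    System.out.println(read(new ByteArrayInputStream("hello".getBytes("UTF-8"))));
  }
}
```

This covers the common case; the two-block pattern above remains necessary for resources that do not implement AutoCloseable.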

The presence of null in Java code is a clear indicator of code smell. Something is not right if you have to use null. The only place where the presence of null is justified is where we're using third-party APIs or JDK. They may return null sometimes because... well, their design is bad. We have no other option but to do if(x==null). But that's it. No other places are good for null.


Wring.io, a Dispatcher of GitHub Notifications

  • Frankfurt, Germany

I participate in over 50 repositories on GitHub. We manage all of our projects there. GitHub sends me hundreds of emails every day. I'm serious. Hundreds! I tried to filter them somehow in Gmail, but it's not really possible. Gmail filters are not powerful enough to tell different types of notifications apart, and there are many other problems. So I decided to create my own simple filtering machine. It's called wring.io.

The idea of wring.io is simple. First, I'm registering my sources of notifications (called "pipes"), such as GitHub. Then I'm giving wring.io permission to connect to GitHub on my behalf and fetch whatever is new there.

Then I'm configuring what should be filtered out, using text matching and/or regular expressions. Right after a new pipe is created, wring.io starts pulling all my sources and updating my inbox. All I need to do is delete new messages from my inbox when I'm done with them. That's it.

Let's see an example. First I'm creating a new pipe:

The figure

It's a JSON object. Property class must be set to io.wring.agents.github.AgGithub. This is the name of the Java class that will be pulling my notifications from GitHub. The project is open source, so you can see how the class actually works: AgGithub.

Property token must be set to the personal access token that I should create first in GitHub. The server will connect to GitHub on my behalf and under my credentials:

The figure

Property ignore must have an array of strings. Each item is a matching pattern. I can use plain text or a regular expression. By default, it's plain text. If exactly the same text is found in a notification, it will be ignored. To use a regular expression, I need to wrap it in slashes (for example /[a-z]+/). You may skip that property and just specify this JSON:

{
  "class": "io.wring.agents.github.AgGithub",
  "token": "your-personal-access-token"
}
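Conversely, a pipe that actually uses the ignore property could look like this; the two patterns below are purely illustrative (one plain-text match, one regular expression wrapped in slashes):

```json
{
  "class": "io.wring.agents.github.AgGithub",
  "token": "your-personal-access-token",
  "ignore": [
    "new dependency version",
    "/release [0-9.]+ published/"
  ]
}
```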

Then I go to my inbox and read what's there.

This solution literally saves me hours of time now. Feel free to use it, it's absolutely free. Moreover, it's open source, so feel free to contribute.


Pimp Up Your Resume

  • Atlanta, GA

There are tons of articles about resume writing. Literally, tons of them. And here's yet another one? Well, maybe ... but I don't think so. I'll try to give you a few practical hints for how to make your resume look "sexier," and how to position yourself beyond the "good programmer" category and into the superstar zone. It may take a few years to truly pimp up your CV, but when it's done, you will charge $100-plus per hour and face no hesitation from your clients in paying.

Delicatessen (1991) by Jean-Pierre Jeunet

One Page, No Exceptions

I think it's obvious, but only one out of the 10 resumes I get each day fits onto one page. All the others take three or more pages, and this looks very unprofessional. If you can't explain yourself in one page, there will be doubts about your skills in scope management, which are very important for a software engineer. It shows you simply can't filter out what's unimportant and focus on what really matters. Besides, it's just boring to read three pages.

Thus, strictly limit yourself to one page, no exceptions. Your resume is an "executive summary" of the product you're trying to sell. It's a marketing brochure. A sales flyer. A sticker on my refrigerator, if you wish. It has to be short and straightforward. Employers will either buy it or throw it away. They don't want to read it; they want to buy you. Or throw your brochure away. Four-page brochures have far fewer chances than one-pagers.

Don't Lie

No matter what you do with your resume, never lie about a single word. You can tell half of the truth, you can hide some information, and you can rephrase the truth, but never lie. You don't know who is reading your CV and which desk it will end up on. Be ready to answer for every word you have on it.

If you're saying that you're an "expert in JavaScript," be ready to explain what the key new features of ECMAScript version 6 are. If you can't, don't use the word "expert." The point is that you have to be ready to prove every word.

Sexy Photo on the Top

You want them to work with you, right? They want to see you. So a photo is a mandatory component of a CV. And try to make it look artistic. Ask your graphic designer friend to style it. Maybe even pay for this work. Just pay attention to the photo; it's very important.

Do I have to say that you must smile on that photo? Well, yes, you must. And make it casual, with a T-shirt and funny background. You must look relaxed and successful. You don't want to get hired; they want to hire you---this is the message your photo should send, just like in online dating.

Skip "Objective" and "Title"

"Senior Software Developer," "Seasoned Java Programmer," "Talented IT Professional," etc. It's boring and doesn't sell you at all. They know what you are, because they are reading your resume.

Besides that, you're limiting yourself with that title. Maybe they are looking for a VP of engineering while your resume says "Software Architect." That's immediately a mismatch for them. It's a strike against you. Your name is the title of your resume. That's it.

A Dozen Skills

This section of your resume actually tells them about your "tech focus." It must have a very short list of skills, definitely under 12. You simply can't be an expert in MySQL, PostgreSQL, Oracle, and MS SQL at the same time. If there are too many skills, it's a sign of a "jack of all trades" who is almost always a "master of none." Don't do that.

Find the most important skills in your profile and put them there. Just a few. And make sure the skills are all on the same level of abstraction. Java and AngularJS must not be present together. Java is a few levels higher than AngularJS. Thus, it's either "Java, SQL, and HTTP" or "AngularJS, Spring Framework, and Web Sockets." I would recommend you stay at the lowest level you can until you become a serious market figure. For example, "Java" as a skill would look good in Jon Skeet's resume, because he definitely knows the entire Java world, and the market has recognized that. But if you're a programmer with just three years of experience, how can you "know Java?" You barely know a few hundred classes from it. That's why it's better to state specifically which parts of Java you definitely know. Like I said, be as low-level and specific in your skills as possible.

StackOverflow Profile

No matter what anyone says, StackOverflow is the de-facto standard platform for asking and answering technical questions. Your presence there and your high rating send a clear message to your potential employer that you're a superstar (or a rising one). Not so many people have 100K or more reputation points there. You must be one of them.

So even if you don't have a StackOverflow profile now, create one. Spend one hour on it every day for a few months, answering new questions. You will earn 1,000 or more reputation points. Well, provided you have something to say. That's enough for a start. Post a link to your profile right in your resume.

Even if you don't have much to say, be there. Read answers made by others, comment on them, try to help them, and correct them. Become an active part of the community.

GitHub Profile

GitHub is the de-facto standard platform for open source code. There are others like BitBucket, but---I hope---they will die sooner or later. As a modern software developer, you must be on GitHub. You have to contribute to some open source. You have to be visible in the open source world if you want to sell yourself high.

Your potential employer wants to see what the market is thinking about your code and about you. They are afraid of making a mistake by hiring you. Your presence in the open source world is a guarantee for them. Someone has already seen your code, and someone has already given some kudos to your projects. Someone virtually vouches for you. As a result, they will feel more comfortable in hiring you.

To be in the "elite," you don't have to spend all your time on open source projects. Just contribute to the ones you're using already. You're using Sinatra at work? Check its source code. You will find a lot of places that need improvement. Offer them your help and simply submit small pull requests here and there. Besides that, create your own products and market them. You will be surprised by how many users and followers you will attract in just a few years of such activity.

Certificates

Some may say they are not important. Maybe so. But your resume must have them. Some of them are not so difficult to get. With just a few weeks of study and a few hundred dollars, you are not just a Java programmer but a certified one. And there are not so many of those out there. There are millions of Java coders in the world, but only a few percent of them are certified. Regardless of whether you think it's important or not, get those certifications.

As many as you like. But stay away from BrainBench and similar sites. Well, you can get certified there, but don't put them into your resume. That will only demonstrate that you are proud of a very questionable achievement. It's not a good sign.

Sound Names and Numbers

It's a dangerous trick, so be careful. Here is how it works. You have to go through your entire professional history and find well-known names or big numbers. For example, 10 years ago I was helping a startup create software that had IBM as a subscriber. They managed to get some traction, and in two months, IBM decided not to use them anymore. It's a true (and sad) story, but I can put something like this into my resume: "wrote software for IBM." Am I lying? Not really. If they ask me what exactly I did for IBM, I will explain. In most cases, they won't ask. They will just buy this big name and put my resume on top of others.

You can do the same with numbers. Here is another true story. A few years ago, I was helping a company configure a continuous integration pipeline. It was not a big deal, but the company was serving more than 5 million hits per day on its website. That's a big number. I had nothing to do with this substantial web traffic, but I was in that company for a few months. So I may say in my resume: "configured the delivery pipeline for a 5M-hits-per-day web store." If they ask me for details, I will be able to give them. I'm not lying.

Use this technique carefully, and never lie. But do it. Don't be scared. Your resume needs big names and numbers.

Blog

Create your own blog. Start writing. About what? About your everyday achievements. About the code you write and read. About what you observe in the office. About your thoughts and your plans. About the books you read. You absolutely need to have a blog if you want to position yourself as an expensive software engineer.

It doesn't need to be a very popular blog; don't focus on numbers. But it has to be properly created, designed, and hosted. Don't use WordPress, Blogger, or Tumblr. Instead, I would recommend you think about static site generators like Jekyll and host it all on GitHub Pages. That's what I'm doing.

Besides being a valuable addition to your resume, systematic and regular writing will help you structure your thoughts, plans, and decisions. Well, that's what I'm getting from my blog.

Ambitions

If you're young and your resume is not full of bright achievements yet, you can add an "ambitions" section to it. There you say what you're planning to achieve, to impress your future employer. For example: "learn Go," "create a new open source CSS framework," "write a book," or "earn Oracle Java certification next year." This will demonstrate that even though you're young, you do care about your career and professional growth.

Education

I would limit this section to a few letters. Just "MSc" or "BSc" is enough. There's no need to say when you graduated and from which school. You can give those details later. Well, there are just two exceptions to this rule.

First, if you have a PhD, put that on top of your CV. It's important, and it's valuable, simply because there are not so many PhDs among programmers. Second, if your school is Stanford, MIT, or something similar, also put it on top of your CV.

In all other situations, just write "BSc" and that's it.

Conferences

Every year, you must give a few presentations at JavaOne. I'm kidding. About the JavaOne part, anyway. But the "every year" part is very true. You must regularly make some speeches somewhere, preferably at JavaOne. But until you get there, speak where you can. Well, where they accept you. Create a profile at lanyrd (or something similar) and regularly check which conferences are looking for speakers. Submit there and you will be surprised to see that a few of them will actually accept some of your ideas.

The easiest subjects to start with are stories about your practical experience with some modern technologies and tools, something like "How Docker Helps Us Optimize Delivery" or "Five Apache Spark Installation Issues." Just describe what you've done on a recent project. It doesn't really matter what you talk about. What matters is that you're visible. If the market accepts you, the employer will trust you more. That's exactly what you need in order to request a higher rate.

Career History

I'll be speaking for myself here. As an employer, I don't really care about your history at all. Moreover, if you have never worked anywhere full-time, I would probably be more interested in working with you. But that's just me, because I truly believe that modern offices and full-time jobs turn programmers into slaves (and not only programmers).

Other employers may think differently. Well, they most likely think differently. That's why you have to demonstrate in which offices you've spent the last 10 years of your life. I would recommend you keep this list short. Even if you've changed eight companies over the last two years, don't say that. Just three places is enough. That will show them you're a good slave---very loyal to previous masters. That's what they want to see, because they are planning to buy you and become your next master. Right? Sounds harsh? Isn't it true?

Also, your experience section must mention your achievements, not your duties. Instead of "managing 300+ AWS nodes" or "building mobile apps," it's better to say "created a 300+ node AWS infrastructure" or "built a few mobile apps."

ACM, IEEE, JUG, and Other Memberships

These memberships mean literally nothing but will demonstrate that you are part of those communities. Just like with most other things mentioned above, employers will trust you more if the market already trusts you. These memberships don't really mean that anyone has recognized you, since you get them just by paying annual fees. But still, you're paying those fees, and most other applicants aren't. You're definitely more reliable than many others.

Mention Hobbies

I think information about hobbies is important. Some say it's not, but I believe that a personal "click" between you and your potential employer plays an important role. There is a human on that side. He or she is reading your CV, and he or she wants to like you---mostly in order to be comfortable making a hiring decision. Help him or her like you faster. Mention that you enjoy skiing, monkey-feeding in the zoo, and Jimmy Kimmel. Be creative, not boring. Just like you do in online dating.

Layout/Graphics

How should your one page look? Stand out! It must express your personality. Don't use the "resume templates" downloadable for free. Create your own layout and design. If you're not a designer, ask your Photoshop friend to do it for you. Actually, there is not much to do; just select the right font and add a few colors here and there.

This CV is your product. You made it. It's your baby. If it's just a Word document in a standard template, they will feel you didn't pay attention to it. You didn't even care to create that small but very important product nicely. How will you create their software? With the same attitude. Don't ruin the whole show with a careless design. That's the key word here---"careless." It doesn't need to be complex. It may be very simple. But it must be yours, made with care and love.


Want free advice on your resume? Send it to cv@yegor256.com, and I'll let you know what I think. I will reply to all emails, but be ready to hear mostly what's wrong. What's right you will know without me, when they pay you $200 per hour.

Look at these samples (they are good): @dozortsev, @leventov.

Here is mine and its longer and boring version.


How We Interview Programmers

  • Palo Alto, CA

At Zerocracy, we've been getting about 10 resumes every day from programmers who want to work with us. We don't do video or online coding interviews. We don't ask you to solve any puzzles or demonstrate your algorithm-writing abilities. Moreover, when we decide not to hire you, we honestly and openly explain why. And we almost never offend anyone. So how exactly does it work? There are a few basic principles I would like to share.

Wall Street (1987) by Oliver Stone

The Market Interviews You, Not Us

We believe that the market is a much better interviewer than any one of us. "Instead of demonstrating to us how great your code is, show it to the market and see what it says"---that's what we're saying to you, our candidate.

How does the market validate that code? Open source---that's what is the most convincing to us. We ask you to show us which open source products you have and how popular they are.

Then show us your blog, your conference talks, your hackathon gold medals, your certifications, and any other awards the market has given you. Don't convince us that you're cool; convince them. If they are convinced, we will be glad to hire you. Isn't that objective? I believe it is.

Quiz

The way we understand the quality of code is very different from what you might expect. Simply put, our quality bar is much higher. Besides that, the way we understand object-oriented programming is also very different. So occasionally we'll find that we simply can't trust the market as our only source of information, especially when the market has almost nothing to say about you. Some programmers come to us with zero open source experience, no certificates, and no public work. Still, they claim they are the best.

To put them to the test, we provide a piece of code and ask them to refactor it---just make it better. I think this approach perfectly demonstrates who is in front of us: a hacker or a designer. In this way, we filter out a lot of people who pay attention to minor implementation tricks but miss the bigger design issues.

The quiz is here. You can see how many pull requests there are already---all of them are from our candidates.

No Phone or Video Calls

Most companies talk a lot about diversity and equality, yet most of them will also require a Skype video call or at least a phone call before getting you on board. How does this really jibe with the equality emphasis? A face-to-face interview is a very stressful process even for experienced and extroverted people. We can imagine how difficult it is for some programmers who are anti-social introverts just like me.

Video interviewing is a terrible practice unless you're hiring a stand-up comedian or a flight attendant. Programmers are not supposed to achieve their goals by interacting with people face to face. Well, at least not in our remote work mode. We expect you to write code and communicate via GitHub. Why on earth would we need to call you? We just don't do it, and I think that's how everybody else should operate. That's what true equality and diversity is---no phone or video calls.

We Explain When We Reject

You apply to us and spend time presenting yourself, talking to us, and demonstrating to us your skills and profile. We feel that we have to give something back, especially if we don't hire you. That's why we always explain what's wrong with your application. We are not hiding anything, and sometimes our responses may sound rather disturbing. You may hear something like this: "Your quiz solution is not what I would expect from an experienced developer" (I'm quoting one of our interviewers).

Unlike many other companies, we will never say something like, "Thanks for applying, but we decided not to proceed further. Wish you luck!" That's shallow and ignorant. But that's what most big companies do, including Google, Facebook, and other "no evils." Try to apply there, and you will see for yourself.

Instead, we believe that an honest and straightforward negative answer is exactly what our candidates are looking for in the case of a rejection. We understand that it's not the end of the world for you---you're going to continue learning and improving. Our feedback will help you. So why should we hide it behind that polite "good luck" answer? We won't. You will know exactly why you are not good enough for us.

Moreover, we are always trying to suggest a direction for improvement. We will recommend what to learn, what to do, and how to grow before coming back to us. I haven't seen a single company do that in my personal job searches from the past.


Try to apply; the form is here.


Holacracy or Autocracy? Both!

  • Palo Alto, CA

I strongly believe that while it is very effective to structure an organization in a democratic and sociocratic way, a project should be managed completely differently. A project should resemble a dictatorship: an authoritarian or military hierarchy with an explicitly defined chain of command and a single strong, result-oriented leader whose explicit orders are never doubted by subordinates.

Apocalypse Now (1979) by Francis Ford Coppola

According to Wikipedia at the time of writing, a holacracy exists when "authority and decision-making are distributed" while, on the other hand, autocracy exists when "supreme power is concentrated in the hands of one person."

When I say organization, I mean a team, a startup, a company, that sort of thing. It's something with a brand, an office, a business entity and a bank account. The role of an organization is very similar to the role of a country or a government: to provide security in exchange for freedom. Democracy in a country, as well as in a team, guarantees equality to its members, which is the most important component of security.

A holacracy, also known as a "flat organization," technically refers to the absence of bureaucracy, special privileges, expensive furniture and private parties for top management. In a flat team, the distance between the CEO and a junior programmer is very small. They sit together in the same room, eat in the same cafe, and discuss team strategy like friends. There are no "bosses" on a flat team, only "leaders." They don't give orders, they inspire. They don't punish, they celebrate success and mourn failure together with everybody. Well, that's the idea of a holacracy. And it actually works. I've seen it many times.

However, when we're talking about project management, this very same approach will have catastrophic consequences. A project is something very different from a team. A project is a "temporary endeavor undertaken to create a unique product, service, or result," according to PMBOK. A project is something that starts and ends. The key objective of a project is to end, while an organization's objective is to survive. See the difference? A new mobile app, a conference, a new release, a round of investments---these are examples of projects. They start, and they end. We don't want any of them to live forever; we want them to finish as soon as possible, and obviously with a positive outcome.

Because of this fundamental difference, a project must be managed by an authoritative person who gives orders and has enough guts to ensure those orders are obeyed. That person is called a project manager (PM). And the project will be successful only if its management structure is strictly hierarchical, just like in a military operation. A project cannot be flat, or it will fall apart.

Since a project is a temporary endeavor, it doesn't give security to its participants. And it doesn't take away our freedom. The arrangement is different: A project gives us money and takes our time. The project basically says to all of us, its participants: "Let's get it done and go our own ways." Having this philosophy in mind and understanding the motivation of everybody involved, the PM must use instruments that have nothing to do with what keeps the organization alive.

An organization/team/company/family will stay together for a long time if we value things like tolerance, respect, patience, equality, and appreciation.

To the contrary, a project will finish successfully if we value completely different things: discipline, subordination, awards, punishments, and rules.

To summarize my thoughts, I would say that a successful company combines these two approaches by being a matrix organization that promotes holacracy in the team and autocracy in the projects it is working on.

Are You Still Debugging?

  • Palo Alto, CA

Debugging is "a process of running a program/method interactively, breaking execution flow after each statement and showing..." In a nutshell, it is a very useful technique ... for a bad programmer. Or an old programmer who is still writing procedural code in C. Object-oriented programmers never debug their code---they write unit tests. My point here is that unit testing is a technique that completely replaces debugging. If debugging is required, the design is bad.

The Revenant (2015) by Alejandro G. Iñárritu

Let's say I'm a bad imperative procedural programmer, and this is my Java code:

class FileUtils {
  public static Iterable<String> readWords(File f) throws IOException {
    String text = new String(
      Files.readAllBytes(f.toPath()),
      StandardCharsets.UTF_8
    );
    Set<String> words = new HashSet<>();
    for (String word : text.split(" ")) {
      words.add(word);
    }
    return words;
  }
}

This static utility method reads file content and then finds all the unique words in it. Pretty simple. However, if it doesn't work, what do we do? Let's say this is the file:

We know what we are,
but know not what we may be.

From it, we get this list of words:

"We"
"know"
"what"
"we"
"are,\nbut"
"not"
"may"
"be.\n"

Now that doesn't look right to me ... so what is the next step? Either the file reading doesn't work correctly or the split is broken. Let's debug, right? Let's feed the method a file and step through it, tracing and watching the variables. We'll find the bug and fix it. But when a similar problem shows up, we'll have to debug again! And that's exactly what unit testing is supposed to prevent.

We're supposed to create a unit test once, in which the problem is reproduced. Then we fix the problem and make sure the test passes. That's how we save our investments in problem solving. We won't fix it again, because it won't happen again. Our test will prevent it from happening.

However, all this will work only if it's easy to create a unit test. If it's difficult, I'll be too lazy to do it. I will just debug and fix the problem. In this particular example, creating a test is a rather expensive procedure. What I mean is the complexity of the unit test will be rather high. We have to create a temporary file, fill it with data, run the method, and check the results. To find out what's going on and where the bug is, I'll have to create a number of tests. To avoid code duplication, I'll also have to create some supplementary utilities to help me create that temporary file and fill it with data. That's a lot of work. Well, maybe not "a lot," but way more than a few minutes of debugging.

Thus, if you perceive debugging to be faster and easier, think about the quality of your code. I bet it has a lot of opportunities for refactoring, just like the code from the example above. Here is how I would modify it. First of all, I would turn it into a class, because utility static methods are a bad practice:

class Words implements Iterable<String> {
  private final File file;
  Words(File src) {
    this.file = src;
  }
  @Override
  public Iterator<String> iterator() {
    final String text;
    try {
      text = new String(
        Files.readAllBytes(this.file.toPath()),
        StandardCharsets.UTF_8
      );
    } catch (final IOException ex) {
      throw new UncheckedIOException(ex);
    }
    Set<String> words = new HashSet<>();
    for (String word : text.split(" ")) {
      words.add(word);
    }
    return words.iterator();
  }
}

It looks better already, but the complexity is still there. Next, I would break it down into smaller classes:

class Text {
  private final File file;
  Text(File src) {
    this.file = src;
  }
  @Override
  public String toString() {
    try {
      return new String(
        Files.readAllBytes(this.file.toPath()),
        StandardCharsets.UTF_8
      );
    } catch (final IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }
}
class Words implements Iterable<String> {
  private final String text;
  Words(String txt) {
    this.text = txt;
  }
  @Override
  public Iterator<String> iterator() {
    Set<String> words = new HashSet<>();
    for (String word : this.text.split(" ")) {
      words.add(word);
    }
    return words.iterator();
  }
}

What do you think now? Writing a test for the Words class is a pretty trivial task:

import org.junit.Test;
import static org.hamcrest.MatcherAssert.*;
import static org.hamcrest.Matchers.*;
public class WordsTest {
  @Test
  public void parsesSimpleText() {
    assertThat(
      new Words("How are you"),
      hasItems("How", "are", "you")
    );
  }
}

How much time did that take? Less than a minute. We don't need to create a temporary file and load it with data, because class Words doesn't do anything with files. It just parses the incoming string and finds the unique words in it. Now it's easy to fix, since the test is small and we can easily create more tests; for example:

import org.junit.Test;
import static org.hamcrest.MatcherAssert.*;
import static org.hamcrest.Matchers.*;
public class WordsTest {
  @Test
  public void parsesSimpleText() {
    assertThat(
      new Words("How are you"),
      hasItems("How", "are", "you")
    );
  }
  @Test
  public void parsesMultipleLines() {
    assertThat(
      new Words("first line\nsecond line\n"),
      hasItems("first", "second", "line")
    );
  }
}
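Once parsesMultipleLines reproduces the bug, fixing it is a one-line change in Words. Here is one possible fix, as a sketch: splitting on the `\s+` whitespace regex is my assumption of how the fix could look, not taken from the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Words with the split bug fixed: it now splits on any run of
// whitespace ("\\s+"), including newlines, instead of a single space.
class Words implements Iterable<String> {
  private final String text;
  Words(String txt) {
    this.text = txt;
  }
  @Override
  public Iterator<String> iterator() {
    Set<String> words = new HashSet<>(
      Arrays.asList(this.text.split("\\s+"))
    );
    return words.iterator();
  }
}
```

With this change, both tests above pass, and the fix stays locked in: if anyone breaks the multi-line case again, parsesMultipleLines will fail immediately.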

My point is that debugging is necessary when the amount of time to write a unit test is significantly more than the time it takes to click those Trace-In/Trace-Out buttons. And it's logical. We all are lazy and want fast and easy solutions. But debugging burns time and wastes energy. It helps us find problems but doesn't help prevent them from reappearing.

Debugging is needed when our code is procedural and algorithmic---when the code is all about how the goal should be achieved instead of what the goal is. See the examples above again. The first static method is all about how we read the file, parse it, and find words. It's even named readWords() (a verb). To the contrary, the second example is about what will be achieved. It's either the Text of the file or Words of the text (both are nouns).

I believe there is no place for debugging in clean object-oriented programming. Only unit testing!

Design Patterns and Anti-Patterns, Love and Hate

  • Palo Alto, CA

Design Patterns are ... Come on, you know what they are. They are something we love and hate. We love them because they let us write code without thinking. We hate them when we see the code of someone who is used to writing code without thinking. Am I wrong? Now, let me try to go through all of them and show you how much I love or hate each one. Follow me, in alphabetic order.

The Shining (1980) by Stanley Kubrick

Abstract Factory. It's OK.

Adapter. Good one!

Bridge. Good one!

Builder. Terrible concept, since it encourages us to create and use big, complex objects. If you need a builder, there is already something wrong in your code. Refactor it so any object is easy to create through its constructors.

Chain of Responsibility. Seems fine.

Command. It's OK.

Composite. Good one; check out this too.

Data Transfer Object. It's just a shame.

Decorator. My favorite one. I highly recommend you use it.

Facade. Bad idea. In OOP, we need objects and only objects, not facades for them. This design pattern is very procedural in its spirit, since a facade is nothing more than a collection of procedures.

Factory Method. This one seems OK.

Flyweight. It's a workaround, as I see it, so it's not a good design pattern. I would recommend you not use it unless there is a really critical performance issue. But calling it a design pattern ... no way. A fix for a performance problem in Java? Yes.

Front Controller. Terrible idea, as well as the entire MVC. It's very procedural, that's why.

Interpreter. It's OK, but I don't like the name. "Expression" would be a much better alternative.

Iterator. Bad idea, since it is mutable. It would be much better to have immutable "cursors."
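Such a cursor is not a standard Java type; as a minimal sketch of the idea (the Cursor class and its method names are my own invention), each advance returns a fresh object instead of mutating state:

```java
import java.util.List;

// A sketch of an immutable "cursor": reading never mutates it,
// and advancing returns a brand-new cursor object.
final class Cursor<T> {
  private final List<T> items;
  private final int pos;
  Cursor(List<T> items) {
    this(items, 0);
  }
  private Cursor(List<T> items, int pos) {
    this.items = items;
    this.pos = pos;
  }
  boolean end() {
    return this.pos >= this.items.size();
  }
  T value() {
    // throws IndexOutOfBoundsException if end() is true
    return this.items.get(this.pos);
  }
  Cursor<T> next() {
    return new Cursor<>(this.items, this.pos + 1);
  }
}
```

A loop then reads `for (Cursor<String> c = start; !c.end(); c = c.next())`, and any intermediate cursor can be stored and re-read safely, which a mutable Iterator cannot offer.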

Lazy Initialization. It's OK.

Marker. It's a terrible idea, along with reflection and type casting.

MVC. Bad idea, since it's very procedural. Controllers are the key broken element in this concept. We need real objects, not procedural controllers.

Mediator. I don't like it. Even though it sounds like a technique for decreasing complexity and coupling, it is not really object-oriented. Who is this mediator? Just a "channel" between objects? Why shouldn't objects communicate directly? Because they are too complex? Make them smaller and simpler, rather than inventing these mediators.

Memento. This idea implies that objects are mutable, which I'm against in general.

Module. If Wikipedia is right about this pattern, it's something even more terrible than the Singleton.

Multiton. Really bad idea. Same as Singleton.

Null Object. Good one. By the way, see Why NULL Is Bad.

Object Pool. Good one.

Observer. The idea is good, but the name is bad, since it ends with -ER. A much better one would be "Source" and "Target." The Source generates events and the Target listens to them.

ORM. It's terrible and "offensive"; check this out.

Prototype. Good idea, but what does it have to do with OOP?

Proxy. Good one.

RAII. This is a really good one, and I highly recommend you use it.

Servant. A very bad idea, because it's highly procedural.

Singleton. It's the king of all anti-patterns. Stay away from it at all costs.

Specification. It's OK.

State. Although it's not implied, I feel that in most cases the use of this pattern results in mutability, a code characteristic that I'm generally against.

Strategy. A good one.

Template Method. It is wrong, since implementation inheritance is procedural.

Visitor. A rather procedural concept that treats objects as data structures, which we can manipulate.


I have nothing against concurrency patterns; they are all good, since they have almost nothing to do with object-oriented programming.

If you know some other design (anti-)patterns, let me know in the comments below. I'll add them here.

Defensive Programming via Validating Decorators

  • Palo Alto, CA

Do you check the input parameters of your methods for validity? I don't. I used to, but not anymore. I just let my methods crash with a null pointer and other exceptions when parameters are not valid. This may sound illogical, but only in the beginning. I'm suggesting you use validating decorators instead.

Shi mian mai fu (2004) by Yimou Zhang

Let's take a look at this rather typical Java example:

class Report {
  void export(File file) {
    if (file == null) {
      throw new IllegalArgumentException(
        "File is NULL; can't export."
      );
    }
    if (file.exists()) {
      throw new IllegalArgumentException(
        "File already exists."
      );
    }
    // Export the report to the file
  }
}

Pretty defensive, right? If we remove these validations, the code will be much shorter, but it will crash with rather confusing messages if NULL is provided by the client. Moreover, if the file already exists, our Report will silently overwrite it. Pretty dangerous, right?

Yes, we must protect ourselves, and we must be defensive.

But not this way, not by bloating the class with validations that have nothing to do with its core functionality. Instead, we should use decorators to do the validation. Here is how. First, there must be an interface Report:

interface Report {
  void export(File file);
}

Then, a class that implements the core functionality:

class DefaultReport implements Report {
  @Override
  public void export(File file) {
    // Export the report to the file
  }
}

And, finally, a number of decorators that will protect us:

class NoWriteOverReport implements Report {
  private final Report origin;
  NoWriteOverReport(Report rep) {
    this.origin = rep;
  }
  @Override
  public void export(File file) {
    if (file.exists()) {
      throw new IllegalArgumentException(
        "File already exists."
      );
    }
    this.origin.export(file);
  }
}
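The composition shown next also uses a NoNullReport decorator, which the snippet above omits. A minimal sketch of it, mirroring NoWriteOverReport, could look like this (the Report interface is repeated here only to keep the sketch self-contained):

```java
import java.io.File;

interface Report {
  void export(File file);
}

// Sketch of the NoNullReport decorator: it rejects a NULL file
// before delegating the actual work to the decorated Report.
class NoNullReport implements Report {
  private final Report origin;
  NoNullReport(Report rep) {
    this.origin = rep;
  }
  @Override
  public void export(File file) {
    if (file == null) {
      throw new IllegalArgumentException(
        "File is NULL; can't export."
      );
    }
    this.origin.export(file);
  }
}
```

Each decorator validates exactly one precondition and then delegates, so they can be stacked in any order the client needs.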

Now, the client has the flexibility of composing a complex object from decorators that perform their specific tasks. The core object will do the reporting, while the decorators will validate parameters:

Report report = new NoNullReport(
  new NoWriteOverReport(
    new DefaultReport()
  )
);
report.export(file);

What do we achieve with this approach? First and foremost: smaller objects. And smaller objects always mean higher maintainability. Our DefaultReport class will always remain small, no matter how many validations we may invent in the future. The more things we need to validate, the more validating decorators we will create. All of them will be small and cohesive. And we'll be able to put them together in different variations.

Besides that, this approach makes our code much more reusable, as classes perform very few operations and don't defend themselves by default. When defensiveness is required, we'll wrap them in validating decorators. But this will not always be the case. Sometimes validation is just too expensive in terms of time and memory, and we may want to work directly with objects that don't defend themselves.

I also decided not to use the Java Validation API anymore for the same reason. Its annotations make classes much more verbose and less cohesive. I'm using validating decorators instead.

How Expensive Is Your Outsourcing Team?

  • Palo Alto, CA

Let me put it this way: $15 per hour for a senior Java developer---is that cheap or expensive? It's cheap, right? Right. What would you say if I told you this cheap Java developer hardly writes two primitive lines of code per day? You're paying $600 every week but rarely getting anything back. How cheap is this Java guy now? My point is that using hourly rate as a cost indicator is a very bad idea, whether with outsourcing or in-house teams.

The Fan (1996) by Tony Scott

I actually decided to write this after a short sales meeting recently with a prospect from Illinois. He wanted to hire Zerocracy for his Java project and seemed to like our approach. I explained how we work, how we control quality, and why and how we're different from everybody else. He seemed to be impressed. Then, he asked, "How much do you charge?"

I told him that we are also different in the way we bill for our work, because we don't charge for the time spent by our programmers sitting in front of monitors. Instead, we bill for results produced, merged, and delivered. I showed him this article about incremental billing. He seemed to understand the advantages of our approach, compared to the hourly salaries being paid by almost everybody else in the market.

Still, the question remained---how much?

What could I do? I had to give him an answer.

I told him that our best Java programmers earn $30 to $50 per hour and we add our margin on top of that, in the amount of 100 percent, for management. In the end, "one hour" will cost him $60 to $100. He ran away.

What did I do wrong? I think I know what it was. I didn't explain to him that, under different management, programmers deliver very different results in the same 60 minutes. By "very different," I mean dramatically different. Let me demonstrate the numbers (I actually did that already almost two years ago, in my How Much Do You Pay Per Line of Code? post, but will try again from a different perspective).

Take a look at yegor256/takes#430, a feature request in the Takes Framework, one of the projects we're managing. Let's see how much the project paid for the work done in this ticket:

  • 15 minutes to me for creating a new ticket
  • 30 minutes to @hdous for fixing it
  • 52 minutes to @pinaf for code review
  • 20 minutes to @ypshenychka for QA review

Assuming that an average price "per hour" is $50 ($25 for developers and $25 for our management), the total cost of this new feature was $97.50 (117 minutes).
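The arithmetic is easy to verify; here is a throwaway sketch (the class name and the integer-cents representation are mine, chosen to avoid floating-point rounding):

```java
public class FeatureCost {
  public static void main(String[] args) {
    // Minutes billed on ticket yegor256/takes#430, per the list above:
    // ticket creation, fix, code review, and QA review.
    int minutes = 15 + 30 + 52 + 20;
    // Blended rate: $25/hour for the developer plus $25/hour for
    // management, i.e. 5000 cents per hour.
    int cents = minutes * 5000 / 60;
    System.out.printf(
      "%d minutes = $%d.%02d%n",
      minutes, cents / 100, cents % 100
    );
  }
}
```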

Look at these two tickets again. #430 is the feature request and #493 is the pull request with two new Java files and code review comments.

Four people worked on this feature. If you put them all together in an office, full-time, with the same hourly rate, they will cost $800 per day (I'm not adding any management costs!). Now the question is whether they will be able to create eight new features every day.

If you're a manager, you know the velocity of your programmers. If you're a programmer, you know how much code you can write in a day. Now, honestly tell me: do you find and solve eight bugs per day, with that level of complexity, detailed code review, and precision of documentation? I seriously doubt it.

In that How Much Do You Pay Per Line of Code? post, I actually did a comparison of a co-located project, where I was an architect, with a distributed one, where I also was an architect. My numbers tell me that a traditionally managed team is at least 10 times less productive than a team managed by Zerocracy under XDSD principles.

My point is that asking "What is your hourly rate?" is just as wrong as asking How Much for This Software? if we're talking about software developers motivated by results, not office slaves.

Instead, we should ask: How much can you do for $100?

As you can see, we can easily demonstrate how much we are capable of delivering for $100. Can you and your team do the same?

Thus, your ROI while working with an outsourced team mostly depends on their results per dollar, not dollars per hour. The first metric is difficult to calculate, and only the best teams will be able to do so. The second metric is absolutely misleading, but anyone will give it to you.

An outsourced team is expensive when its results per dollar are low, no matter how big or small its dollars-per-hour rate is. To the contrary, a team is financially very efficient if its results per dollar are high. It doesn't really matter what the value of the second metric is.

P.S. I'm going to show this article to that prospect who ran away. Maybe he will come back :)

Good Programmers Don't Work for Equity

  • Los Angeles, CA

"You're a good programmer. I'm a great entrepreneur. This is a breakthrough idea. Help me build it. I don't have cash, but I will give you equity. Deal?" I hear this at least once a month, and I always say no. Not because I don't like your idea. Indeed, it is really interesting. And not because I'm too busy. I would definitely find time for a good idea. It's not that. I say no because I don't think you're a good entrepreneur.

Combien tu m'aimes? (2005) by Bertrand Blier

So you want a good programmer to build your product. Or maybe a group of good programmers. And you are ready to give me some equity in exchange. That's reasonable.

But what is your part of the deal?

How much are you putting on the table?

You say that you're a good entrepreneur, right? How come you don't have money, then? How come you can't find someone to pay for the work of a good programmer?

I will create a product for you, but you will most certainly fail. You already failed. You failed to find initial investment to cover the startup expenses of the business. Why do you think you will succeed after the product is ready?

The point is that a good programmer will never work for equity. Not because a good programmer is greedy, or doesn't want to risk, or doesn't believe in new ideas. Not at all.

A good programmer wants to work with a good entrepreneur. And a good entrepreneur knows how to find money. That's the definition of a decent entrepreneur.

Period.

How Do You Punish Your Employees?

  • Kiev, Ukraine

Punishment ... how do you prefer to do it? There are many ways to punish employees; some are rather effective, while others simply don't work. This is not an exact science. Actually, I would say it's an art. You must be creative, innovative, and very open-minded. You never know which method of punishment will work with whom. Some people respond to one method, while others may completely ignore it. The overarching goal, of course, is to make employees scared of you, their boss, so they will obey enthusiastically. Here is a list of the most effective methods.

Office Space (1999) by Mike Judge

Disclaimer: I'm using the pronoun "he" merely for simplicity of speech. The same exact rules apply to both males and females.

Use Your Voice. This, of course, is your best instrument of punishment. Make sure he is scared of you. He must know who he works for---you, his boss. The rule of thumb is that the one with the strongest voice is the boss. Thus, you must be heard, you must rule with your voice, and he must physically feel your presence in the room. Even if it's just a Skype call, your voice must sound stronger than all others. Moreover, don't let him speak if he is trying to argue back. You're the boss! The alpha!

Play Hard to Get. Is he feeling guilty for an error? Simply ignore him. Or, even better, schedule meetings and don't show up. Or reschedule many times. That will demonstrate that you're not interested in him anymore. His frustration will grow. You'll still be in the office, having meetings with other employees, eating lunch, laughing, walking, and talking. You exist, but not for him. He is dead to you, because of his mistake. He is nothing and is getting the silent treatment. Then, suddenly, you attend a meeting! Oh, how happy he will be. He will literally kiss your hands, and you will love that feeling!

Make Fun of Him. We inherit this technique from good, old-school bullying. We all know how it works. Select the person who messed up and make him the target of your jokes. In the office, this method works even better than in school, because you're the boss and he basically can't do anything to you. He will first try to laugh alongside everybody else, but this won't last for the long term. In the end, everybody will laugh at him, and he will do whatever it takes just to stop it. He will obey any order you give!

Mistakes Must be Visible. Is he wrong? Did he miss a deadline? Did he deploy a broken version to production? Did he forget something? Don't resolve this face to face. Always make such things public, simply to let others punish him. This approach is known as peer pressure, a very useful technique. His coworkers must keep up the momentum and punish him using social rejection. That's why, by all means, you as a leader must encourage back-stabbing. You will rise to power much faster if your employees are not only afraid of you but also of each other. Use their fear wisely!

Late-Night Phone Calls. Having a personal life outside the office is not for everybody. It is a luxury, and you are not just going to give it away. He must earn it, and if he is guilty of not completing a task on time, his private life will be ruined by your late-night phone calls. He must remember that. It doesn't really matter what exactly the calls are about. Just make them somewhere around 11 p.m. Your key message is this: "I'm worrying about the project while you're enjoying your family time!" Guilt is what you're planting with this. He won't be able to ask you not to call him after work; he is not that brave. He will instead try to please you somehow so that you stop calling him.

Don't Check Results. This technique is close to playing hard to get, but here you don't ignore him. You communicate with the guilty employee, but you don't talk about his results. You discuss his uncle's wedding, his snowboarding weekend, his new bicycle, etc. But you don't ask about the migration to PostgreSQL he has been working on for the last three weeks. You are not interested. This is a perfect method for demonstrating that you don't see him as a valuable team member anymore. The team doesn't need his results. The team can live without them. You will see how soon he realizes who the boss is and what it means when the boss is not happy!

A Bad Office Spot Is a Great De-Motivator. This is a classic instrument of punishment: The worst desks go to those who forget who the boss is around here. Obviously, the best desk is the one at which nobody can see your monitor. Give those to good people who obey your orders and don't argue with you. They are your core team. They support you as a leader, and they help you rise to power. Others must sit closer to the door, and their monitors must be seen by everybody. As with all other techniques, conceal your intentions---you locate people in the office due to their job descriptions in order to help them communicate effectively. Everybody will understand what's really going on, but you must look like a laissez-faire leader.

Easy Tasks Are Rewards. You decide who does what, and you distribute tasks and projects---that's your instrument of power. Easy-to-do tasks are how you reward those who are loyal and supportive. They complete such tasks, hardly expending any effort. Complex and risky projects, on the other hand, are assigned to the under-performing employee. He will most likely fail, and there won't be anyone to blame---it's just a project, like all others. Boring, ambiguous, unfocused, unnecessary, under-funded, and routine tasks all go to the employee who deserves punishment. Of course, you must look unbiased---be very polite and supportive, acting as if you're a good friend!

Spread Rumors. Simply show your annoyance of his poor results, but not to his face. Talk about his performance with his coworkers. They must know that you're not happy. Furthermore, they must suspect that you're thinking about terminating his contract. Don't say it straight away, but don't deny it if they ask. I doubt they will ask, though. Very soon, these rumors will reach his ear, and he will do whatever it takes just to hear that you're not thinking about termination anymore. He will likely be scared to ask you directly, but even if he does ask, deny it. He will be afraid of you anyway. That will make him much more manageable!

Overtime. Leaving the office at 5 p.m. is a privilege. Only the best employees can afford it. Ideally, everybody must ask you before they leave. The one who feels guilty won't be comfortable asking you whether he can leave at 5 p.m. He will stay longer just because he is afraid to ask. That's exactly what you need! Just to earn the ability to ask whether it's possible to leave the office, he will work harder. The question is how to make employees ask for your permission to go home in the first place. I recommend you stay late and schedule interesting meetings at 6 p.m. Of course, you will come to work at 1 p.m., while everybody else must be there at 9 a.m. sharp. The point is that you must be in the office when they leave, and you must do something important. They will be afraid to disappoint you by showing ignorance, so they will ask for permission. That's what you need!


This list is definitely not exhaustive. I'm sure there are many more interesting methods and technologies. Don't hesitate to share them below in the comments. As I said above, I believe this is an art, not a science.

P.S. If you like this article, you will certainly enjoy the book Management Stripped Bare: What They Don't Teach You at Business School by Jo Owen. I actually borrowed some ideas from that book.



Employee Turnover Is Good for the Maintainability of Your Code Base

  • Kiev, Ukraine

This is what Wikipedia says about this: "High turnover may be harmful to a company's productivity if skilled workers are often leaving, and the worker population contains a high percentage of novices." I agree. However, I believe that low turnover may also be very harmful.

Commando (1985) by Mark L. Lester

I've found this good article where John Sullivan explains why low turnover could be a troubling symptom. It's a really good read, but rather generic. It is not specifically about software teams. My experience is mostly focused on programmers and their turnover. I've learned that low turnover negatively affects code maintainability and encourages hero-driven development and strong code ownership (both of which are bad practices).

"Turnover" is basically the act of replacing an employee with a new employee for any reason, including termination, retirement, resignation, or any other. Simply put, the more people your team loses every year, the higher your turnover. If there are 20 programmers on your team, and five of them walk away every year, your turnover is 25 percent.

I can't pinpoint what number you should aim for, but I strongly believe that if you consider programmers to be a valuable long-term asset, and try to retain them at all cost, you're doing it wrong.

My point is that a healthy software team must replace programmers regularly. I would say having one person on board for longer than a year is asking for trouble.

By replacing, I don't necessarily mean firing. Not at all. I mean moving them away from the code base. Obviously, if you have a single code base, replacing will mean firing.

When programmers stay together for a long time, working on the same code base, they inevitably become subject matter experts. First of all, this leads to strong code ownership. Naturally, each of them becomes a specialist in his or her own part of the code, mostly because it's easier to work with something you're familiar with instead of jumping from module to module. Needless to say, strong code ownership is a bad practice. Collective code ownership is a much better alternative, as explained by Martin Fowler.

Then, having strong experts on the team inevitably leads to hero-driven development, where firefighting is very much appreciated. An expert doesn't want to lose his or her position, and always tries to demonstrate how valuable he or she is for the team. The best way to do this is to solve a problem that nobody else can solve. That's how one gets "job security." And that's how the team starts to degrade. This blog post by Fredrik Rubensson is right about this problem.

Thus, to achieve higher maintainability of the source code and robustness of the product, we must rotate programmers, preventing them from becoming subject matter experts.

I realize this idea sounds counter-intuitive, but think about it. By keeping people together, working on the same problem for a long time, we put a lot of knowledge into their heads, not our source code. These people become the asset. They become smarter, they know the solution very well, and they solve all issues rather quickly. But the code base degrades.

When the time comes to change someone (for any reason), the loss will be damaging. We may lose significant knowledge, and the code base left behind will be unmaintainable. In most cases, we will have to re-write it. That's why in most software teams, management is afraid of programmers. They are scared to lose key software developers, because the consequences may be fatal.

In the end, programmers control management, not the other way around.

An earlier post of mine, It's Not a School!, explains how this problem can be solved without firing or rotating programmers, but few teams, especially co-located ones, can afford that approach. If your team can't, just try to keep your turnover high enough to prevent the appearance of heroes (a.k.a. subject matter experts).


Why Don't You Contribute to Open Source?


  • Kiev, Ukraine

In my How Much Do You Cost? post last year, I said open source contribution is a very important factor in defining who is good and who isn't, as far as programmers go. I was saying that if you're not contributing to open source, if your GitHub profile is not full of projects and commits, your "value" as a software developer is low, simply because this lack of open source activity tells everybody that you're not passionate about software development and are simply working for money. I keep getting angry comments about that every week. Let me answer them all here.

Kung Fu Hustle (2004) by Stephen Chow

The gist of all those comments is this: "I don't contribute to open source, but I'm still very passionate about software development." Then, there is a list of reasons why the author of the comment doesn't contribute:

  • I spend my free time with my family.
  • I'm already busy in the office; why should I do extra work?
  • I'm well-paid; why should I do anything for free?
  • My employer doesn't allow me to contribute to open source.
  • My company won't pay me for writing open source code.

Good excuses, but let's try to look at it from a different perspective.

Today, it's not possible at all to create software without using open source components. I'm sure nobody will argue with this. Only something very basic and simple can be created without code reuse. Nah, I'm wrong. Even super small pieces of software can't be created without open source "neighbors." You need at the very least an operating system and a programming language. In most cases, they are open source (Microsoft is an exception, and it must die).

Thus, no matter what software you're creating, you're using modules created for you by others. Someone else spent his or her time to help you.

Now, you're not giving anything back. I'm curious, why is that?

There could be two reasons. The first one is that you just don't care. They give you something, and you're not giving anything back. You simply don't feel like being a player in this market. You take their libraries, reuse them in your product, collect a paycheck, and go home. You don't care what will happen with the industry, with those programmers, with the language you're writing in, with the platform, etc. You don't want to improve the libraries, you don't want to create and share new ones, you don't want to report bugs and feature requests, and you don't want to send patches and pull requests to them.

I do understand that. Millions of programmers are like that; you're not alone. But please, don't tell me that you're passionate about software development. Just admit that you don't care. It's not a crime, after all. You're not stealing anything (although I actually think you are, but that's a different story).

That was the first reason why you may not contribute. However, in most cases, my typical opponent tells me he or she does care, but just can't. There are obstacles, right? Your family is taking all your free time, and in the office, you simply are not allowed to work on something that is outside of your business scope. I can imagine that, but let's see what's happening behind the scenes.

You're telling me that your company doesn't care about the software industry, right? They don't allow you to give anything back to the open source community. They want you to use those free libraries and give nothing back. And it is their corporate strategy. I doubt that's the case. Did you ask your CTO about it?

I strongly believe that in 95 percent of cases, when you explain that your software seriously depends on a few open source libraries that may need some improvements, your boss will have nothing against you becoming a contributor. Try it.

Sometimes, the boss says he or she doesn't care about any open source and wants you to focus on your product. Maybe this happens rather often; I don't know.

In that case, my next question is philosophical. You're working for such a person and such a company. You're accepting their paychecks. Aren't you a part of this team and this mentality? If you don't walk away, you accept this attitude. You're part of it. It's you who doesn't care, not just them. Because of your existence, they have an ability to not care.

Tomorrow if they ask you to use stolen software, you may say you had no choice: "My boss asked me to do this. I did care about copyright and strongly believed that software authors must be paid, but I had to steal, because that's what my company asked me to do." Does it sound like a good excuse?

The same story goes for open source. If you do care and you're passionate about software development, you will either contribute actively or walk away from the company that doesn't share your passion. What, you can't walk away because of some reasons? Then don't tell me about your passion. Simply admit that you're too weak to follow your passion.

Again, it's not a crime. It's just who you are.


Investors Are Too Scared


  • Palo Alto, CA

We're starting a new thing, a seed fund. Its name is SeedRamp. The formula is simple: You schedule an interview, we have a one-hour conversation, you present me your startup idea, and we either give you cash right away or explain why we don't feel like it. We don't do any due diligence or background checks. The decision is made right there. It's something similar to angel investment, but the amount is smaller---less than $20K, and decisions are faster.

The Game (1997) by David Fincher

There are basically three problems we're trying to solve with this new idea: 1) Investors are cowards, 2) investors are cowards, and 3) investors are cowards. Here is why.

They Are Afraid of Strangers

It's no secret that Silicon Valley is very "corrupted" territory, where in order to get access to money people, you must know some other money people or someone who knows someone, etc. You must be well-connected in order to be successful. You simply can't raise money just by having an awesome idea or even a great implementation. You need connections.

I think that's disgusting.

I am, in general, a big fan of meritocracy, where those who are smarter or stronger win the most. This is the principle we, at Zerocracy, apply to software developers who come to us. I explained it last year in this rather popular and provocative post: How Much Do You Cost?. We simply don't care how many years of experience you have, how much time you've spent with your previous employer, or how many programming languages you know. We only care about your objective achievements, which are validated by the market. And, of course, we don't pay attention to any references or any previous relationships.

I strongly believe this is how it should be.

Unfortunately, this is not how it is in Silicon Valley when a young startup is looking for $100K to $150K of seed money. Angel investors are difficult to reach. They are afraid of you, if you come out of nowhere. They only want to talk to someone they can complain about to their friends. This basically encourages startup founders to spend their time on friend-making activities instead of business-making ones. Very often, good teams simply miss their chance.

They Are Afraid of Telling the Truth

Have you ever talked to a venture capital firm? To angel investors? To any investors, basically? If you have, you'd know they all are very polite, nodding their heads and smiling while listening to your pitch. They usually are "very excited" to meet you and "learn more" about your business.

In the end, they don't give you the money.

Why? Who knows. They won't tell you. They are cowards, and they are afraid of telling you that your idea sucks and your business plan is totally wrong. They are afraid of being honest.

There is an almost identical situation with recruitment. You send your resume to Facebook, they interview you, and you spend a few hours with them, answering their questions. In the end, they email you, saying "We decided not to proceed any further; good luck in your job search." They are afraid of telling you the truth.

And it's disgusting.

In our recruitment process at Zerocracy, we do exactly the opposite. You apply to us, we ask one of our programmers to interview you, and then, when finished, we make a decision about whether you're a good candidate or not. We discuss your profile right in front of you. We don't have any discussions without you. We make our decision fully disclosing our reasoning to you. This is how it should be everywhere, I believe. Especially with regard to investments.

They Are Afraid of Losses

It's an infamous problem, mentioned in virtually every list of "top 10 reasons for startup failures": investors simply turn you into an employee. Before you get their money, you're on your own. You make your own decisions, you manage your business, you're in charge.

Then, you get the money. It doesn't mean you're rich. Not at all. It means that, from now on, you're their employee. They decide what your salary is. They decide whether you can rent this office or not. They decide what car you can afford.

Keep in mind that your salary is lower than what your friends are getting at Facebook. Your salary is low, and you can't change it. All your expenses have to be approved. You're simply under the full control of your board.

Why is that? Because they are afraid of you being free. They are afraid of losing their money. That's why they are doing everything they can to keep a close eye on you.

It's disgusting and very counter-productive.

It's similar to trying to win in poker by always making small bets. In most cases, they lose their money, you lose your time, and the market loses the opportunity to get a new product.

We Are Not Cowards


SeedRamp is going to solve all of these three problems.

First of all, we completely remove the necessity to have any connections in order to reach us. You need money? Just schedule an appointment online. We don't care who you are, where you're coming from, or who you know. We give you one hour of our time, and if we reject your idea, you can apply again in a month. Thus, any young startup without any friends or connections is welcome. Just bring your strategy, your existing results, and your passion, and we'll talk.

Second, we don't say, "We will call you back." We give you our reasons right away, and we always tell the truth. Moreover, we record our interview and publish it on YouTube. Yes, that's not a joke; we will publish all interviews online, and you can see how we talk about other startups. We are not afraid of telling the truth; it's part of our marketing strategy.

Third, we don't sit on your board after the investment is made. We simply give you a check, and you can fly to Vegas the next day and spend all of that money there. We don't care. When our decision is made, we don't tell you what to do with the money. If we gave you the money, we believed in you and your judgment. If you think that the best use of this money would be a new Kawasaki, do it.

Instant Micro Investments

To make it all happen, we have a few principles and limitations.

First, we expect you to ask for enough cash for one calendar month. You simply have to explain to us how much you need for one month and how you're planning to spend this money, approximately. One calendar month. We expect you to come back to us in a month, demonstrate your progress, and ask for one more month. Of course, we may say no.

Second, the maximum we can give you is $20,000. Maybe, in the future, we will raise this limit. For now, it is $20K.

And one last thing. We will ask you to give us some future equity in your startup. You decide how much, but it has to be enough to make us interested. It all depends on your situation. A few percent, I'd guess.

We're planning to host our first interviews in the middle of January 2016. You can schedule them right here.


Temporal Coupling Between Method Calls


  • Kiev, Ukraine

Temporal coupling happens between sequential method calls when they must stay in a particular order. This is inevitable in imperative programming, but we can reduce the negative effect of it just by turning those static procedures into functions. Take a look at this example.

Blueberry (2004) by Jan Kounen

Here is the code:

class Foo {
  public List<String> names() {
    List<String> list = new LinkedList<>();
    Foo.append(list, "Jeff");
    Foo.append(list, "Walter");
    return list;
  }
  private static void append(
    List<String> list, String item) {
    list.add(item.toLowerCase());
  }
}

What do you think about that? I believe it's clear what names() is doing---creating a list of names. In order to avoid duplication, there is a supplementary procedure, append(), which converts an item to lowercase and adds it to the list.

This is poor design.

It is a procedural design, and there is temporal coupling between lines in method names().

Let me first show you a better (though not the best!) design, then I will try to explain its benefits:

class Foo {
  public List<String> names() {
    return Foo.with(
      Foo.with(
        new LinkedList<>(),
        "Jeff"
      ),
      "Walter"
    );
  }
  private static List<String> with(
    List<String> list, String item) {
    list.add(item.toLowerCase());
    return list;
  }
}

An ideal design for method with() would create a new instance of List, populate it through addAll(list), then add(item) to it, and finally return. That would be perfectly immutable, but slow.
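For illustration, that fully immutable variant might be sketched like this (my own sketch, not code from the article): the method copies the incoming list, adds the item to the copy, and returns the copy, leaving the argument untouched.

```java
import java.util.LinkedList;
import java.util.List;

class Foo {
  public List<String> names() {
    return Foo.with(
      Foo.with(new LinkedList<>(), "Jeff"),
      "Walter"
    );
  }

  // Immutable variant: copy the incoming list, then add,
  // so the argument is never mutated
  private static List<String> with(List<String> list, String item) {
    List<String> copy = new LinkedList<>();
    copy.addAll(list);
    copy.add(item.toLowerCase());
    return copy;
  }
}
```

The price is a full copy of the list on every call, which is exactly the slowness mentioned above.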

So, what is wrong with this:

List<String> list = new LinkedList<>();
Foo.append(list, "Jeff");
Foo.append(list, "Walter");
return list;

It looks perfectly clean, doesn't it? Instantiate a list, append two items to it, and return it. Yes, it is clean---for now. Because we remember what append() is doing. In a few months, we'll get back to this code, and it will look like this:

List<String> list = new LinkedList<>();
// 10 more lines here
Foo.append(list, "Jeff");
Foo.append(list, "Walter");
// 10 more lines here
return list;

Is it so clear now that append() is actually adding "Jeff" to list? What will happen if I remove that line? Will it affect the result being returned in the last line? I don't know. I need to check the body of method append() to make sure.

Also, how about returning list first and calling append() afterwards? This is what possible "refactoring" may do to our code:

List<String> list = new LinkedList<>();
if (/* something */) {
  return list;
}
// 10 more lines here
Foo.append(list, "Walter");
Foo.append(list, "Jeff");
// 10 more lines here
return list;

First of all, we return list too early, when it is not ready. But did anyone tell me that these two calls to append() must happen before return list? Second, we changed the order of append() calls. Again, did anyone tell me that it's important to call them in that particular order?

Nobody. Nowhere. This is called temporal coupling.

Our lines are coupled together. They must stay in this particular order, but the knowledge about that order is hidden. It's easy to destroy the order, and our compiler won't be able to catch us.

By contrast, this design doesn't have any "order":

return Foo.with(
  Foo.with(
    new LinkedList<>(),
    "Jeff"
  ),
  "Walter"
);

It just returns a list, which is constructed by a few calls to the with() method. It is a single statement instead of four.

As discussed before, an ideal method in OOP must have just a single statement, and this statement is return.

The same is true about validation. For example, this code is bad:

list.add("Jeff");
Foo.checkIfListStillHasSpace(list);
list.add("Walter");

While this one is much better:

list.add("Jeff");
Foo.withEnoughSpace(list).add("Walter");

See the difference?
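The body of withEnoughSpace() is not shown above; a minimal sketch of it might look like this (the capacity limit is my own invention, purely for illustration):

```java
import java.util.LinkedList;
import java.util.List;

class Foo {
  // Hypothetical capacity limit, just for the example
  private static final int MAX_SIZE = 10;

  // Validation that returns its argument, so calls chain
  // and the order of operations stays visible in the code
  static List<String> withEnoughSpace(List<String> list) {
    if (list.size() >= MAX_SIZE) {
      throw new IllegalStateException(
        String.format("The list is full: %d items max", MAX_SIZE)
      );
    }
    return list;
  }
}
```

Because the check returns the list, the second add() is syntactically glued to the validation: you can't reorder or drop either one without the code visibly changing shape.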

And, of course, an ideal approach would be to use composable decorators instead of these ugly static methods. But if it's not possible for some reason, just don't make those static methods look like procedures. Make sure they always return results, which become arguments to further calls.
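For illustration only, a decorator-style version of the same idea might look like this (the class name With is my own invention, not from the article):

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// A hypothetical decorator: it represents "whatever the decorated
// Iterable contains, plus one more lowercased item"
class With implements Iterable<String> {
  private final Iterable<String> origin;
  private final String item;

  With(Iterable<String> origin, String item) {
    this.origin = origin;
    this.item = item;
  }

  @Override
  public Iterator<String> iterator() {
    List<String> all = new LinkedList<>();
    for (String name : this.origin) {
      all.add(name);
    }
    all.add(this.item.toLowerCase());
    return all.iterator();
  }
}
```

Decorators compose with no temporal coupling at all: `new With(new With(Collections.emptyList(), "Jeff"), "Walter")` is a single expression, and the nesting itself encodes the order.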


Throwing an Exception Without Proper Context Is a Bad Habit


  • San Jose, CA

I keep repeating the same mistake again and again. So it's time to stop and make a rule to prevent this from happening anymore. The mistake is not fatal, but it's very annoying. When I look at production logs, I often see something like "File doesn't exist", and I ask myself: What file? Where is it supposed to exist? What did the server try to do with it? What was going on a second before it crashed? There is no answer in the log, and it's totally my fault. I either 1) don't re-throw or 2) re-throw without providing context. Both are wrong.

Four Rooms (1995) by Allison Anders et al.

This is how the code may look:

if (!file.exists()) {
  throw new IllegalArgumentException(
    "File doesn't exist"
  );
}

It may also look like this:

try {
  Files.delete(file);
} catch (IOException ex) {
  throw new IllegalArgumentException(ex);
}

Both examples demonstrate an inadequate style of handling and reporting exceptional situations. What's wrong here? The exception messages are not thorough enough. They simply don't contain any information from the place where they originated.

This is how they should look instead:

if (!file.exists()) {
  throw new IllegalArgumentException(
    String.format(
      "User profile file %s doesn't exist",
      file.getAbsolutePath()
    )
  );
}

And the second example should look like this:

try {
  Files.delete(file);
} catch (IOException ex) {
  throw new IllegalArgumentException(
    String.format(
      "Can't delete user profile data file %s",
      file.getAbsolutePath()
    ),
    ex
  );
}

See the difference? This may look like redundant code, but it's not. Of course, when I'm writing all this, I don't really care about logs and exceptions. I'm not really expecting this file to be absent.

But I should.

There should be a rule: Every time we throw or re-throw, an exception message must describe the problem with as much detail as possible.

Of course, we can't forget about security: we must not put any sensitive information into an exception message, such as passwords or credit card numbers. Aside from that, as much detail as possible must be exposed to the exception catcher at a higher level.

Throwing an exception is literally an escalation of a problem to a higher level of management. Imagine that my boss is asking me to install a new server. I come back to him in a few hours and say, "I failed; sorry." That would sound strange. He would ask for more details. Why did I fail? What exactly went wrong? Is it possible to do it differently? Etc.

Such code is literally a sign of disrespect to the client:

throw new IllegalArgumentException(
  "File doesn't exist"
);

I have to be more verbose and give more details.

And I'm not alone in this mistake. I see it everywhere, and it really makes debugging difficult, especially in production, where it's almost impossible to reproduce the problem right away.

Thus, please be more verbose in your exception messages. I will do the same in my code :)

And one more thing before you go. In most OOP languages, exceptions are unchecked, which means that catching them is not a mandatory operation, unfortunately. Nevertheless, I recommend you catch, add context, and re-throw them all, always. This may seem like pure noise, but it's not! Just make your methods smaller and ensure all exceptions sent out of them have enough information about their origins. You will do yourself and everybody else a big favor.


Imprisonment for Irresponsible Coding!


  • Palo Alto, CA

If I drive too fast and I get caught, I may get a ticket. If I drive under the influence and get caught, I may go to jail. If I turn my radio up too loud in the middle of the night and my neighbors call the police, I may get into trouble if I don't stop it. The law basically protects us from causing trouble with each other. Why don't we have a law against irresponsible coding?

Thursday (1998) by Skip Woods

Software is part of my life. It actually is my life, not just part of it. I stare at this MacBook for much more time every day than I drive, talk, or listen to the radio.

Code is the territory where I interfere with others, and this is where we may bother each other. Irresponsible coding is precisely how one of us can really disturb the other. So why don't the police protect me against, say, authors of Apache Hadoop?

They created something that turns part of my life into a nightmare---much faster and much more severely than drunk drivers. So, where is the police? Why aren't they protecting me, for my tax dollars? Why aren't those Java guys in jail yet? :)

We need a law against irresponsible coding!

How about two months of imprisonment for a Singleton?


Ringelmann Effect vs. Agile


  • Berlin, Germany

The Ringelmann Effect (a.k.a. social loafing) is basically about people experiencing decreasing productivity when working in groups. We're basically more productive when we work individually to achieve personal goals rather than being teamed up. That was discovered by Prof. Max Ringelmann a hundred years ago in 1913. Today, during my workshop in Berlin at DATFlock 2015, we tried to reproduce that experiment. It seems the French professor was right.

The Business (2005) by Nick Love

Here is what we did. We created two groups with four people in each of them, all non-native English speakers. Then, both groups received the same task---to create as many words as possible using the letters in a single given word. It's a pretty simple task that just requires some time and creativity.


The first group worked as a team. They had just one piece of paper and one pen to write down the words they found. We called them a co-located team.

The second group of four people worked in a distributed mode---they had four pieces of paper and four pens. They didn't communicate with each other and just created words. They knew that the best performer would receive a prize (a bar of organic chocolate).

I promised a prize to the co-located team too. A very similar chocolate bar.

We gave them both just 5 minutes.

Our result was this: 38 words found by the co-located team and 41 words found by the distributed team. Of course, we removed duplicates and non-English words.

The distributed team was 8 percent more productive than the co-located one.

Of course, this may not be a clear experiment, and we can't use these numbers to really prove anything, but it was interesting to see how groups work and what actually motivates us to achieve results. We had an hour-long discussion afterward in an attempt to find out what each group member felt while working in a group or individually.

You can try to repeat this (or a similar experiment) in your team and check the results. Post them below in the comments; it would be interesting to see whether it does or doesn't work in your case.

Now, my main question. If I understand it right, Agile promotes group responsibility and discourages individualism. How does it go along with the Ringelmann Effect? Any thoughts?


Stop Comparing JSON and XML


  • Palo Alto, CA

JSON or XML? Which one is better? Which one is faster? Which one should I use in my next project? Stop it! These things are not comparable. It's similar to comparing a bicycle and an AMG S65. Seriously, which one is better? They both can take you from home to the office, right? In some cases, a bicycle will do it better. But does that mean they can be compared to each other? The same applies here with JSON and XML. They are very different things with their own areas of applicability.

The Men Who Stare at Goats (2009) by Grant Heslov

Here is how a simple JSON piece of data may look (140 characters):

{
  "id": 123,
  "title": "Object Thinking",
  "author": "David West",
  "published": {
    "by": "Microsoft Press",
    "year": 2004
  }
}

A similar document would look like this in XML (167 characters):

<?xml version="1.0"?>
<book id="123">
  <title>Object Thinking</title>
  <author>David West</author>
  <published>
    <by>Microsoft Press</by>
    <year>2004</year>
  </published>
</book>

Looks easy to compare, right? The first example is a bit shorter, is easier to understand since it's less "cryptic," and is also perfectly parseable in JavaScript. That's it, then; let's use JSON and manifest the death of XML! Who needs this heavyweight 15-year-old XML in the first place?

Well, I need it, and I love it. Let me explain why.

And don't get me wrong; I'm not against JSON. Not at all. It's a good data format. But it's just a data format. We're using it temporarily to transfer a piece of data from point A to point B. Indeed, it is shorter than XML and more readable. That's it.


XML is not a data format; it is a language. A very powerful one. Let me show you what it's capable of. Let me basically explain why I love it. And I would strongly recommend you read XML in a Nutshell, Third Edition by Elliotte Rusty Harold and W. Scott Means.

I believe there are four features XML has that seriously set it apart from JSON or any other simple data format, like YAML for example.

  • XPath. To get data like the year of publication from the document above, I just send an XPath query: /book/published/year/text(). However, there has to be an XPath processor that understands my request and returns 2004. The beauty of this is that XPath 2.0 is a very powerful query engine with its own functions, predicates, axes, etc. You can literally put any logic into your XPath request without writing any traversing logic in Java, for example. You may ask "How many books were published by David West in 2004?" and get an answer, just via XPath. JSON is not even close to this.

  • Attributes and Namespaces. You can attach metadata to your data, just like it's done above with the id attribute. The data stays inside elements, just like the name of the book author, for example, while metadata (data about data) can and should be placed into attributes. This significantly helps in organizing and structuring information. On top of that, both elements and attributes can be marked as belonging to certain namespaces. This is a very useful technique during times when a few applications are working with the same XML document.

  • XML Schema. When you create an XML document in one place, modify it a few times somewhere else, and then transfer it to yet another place, you want to make sure its structure is not broken by any of these actions. One of them may use <year> to store the publication date while another uses <date> with ISO-8601. To avoid that mess in structure, create a supplementary document, which is called XML Schema, and ship it together with the main document. Everyone who wants to work with the main document will first validate its correctness using the schema supplied. This is a sort of integration testing in production. RelaxNG is a similar but simpler mechanism; give it a try if you find XML Schema too complex.

  • XSL. You can make modifications to your XML document without any Java/Ruby/etc. code at all. Just create an XSL transformation document and "apply" it to your original XML. As an output, you will get a new XML. The XSL language (it is purely functional, by the way) is designed for hierarchical data manipulations. It is much more suitable for this task than Java or any other OOP/procedural approach. You can transform an XML document into anything, including plain text and HTML. Some complain about XSL's complexity, but please give it a try. You won't need all of it, while its core functionality is pretty straightforward.
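As a sketch, here is how the XPath query mentioned above could be evaluated against the book document in plain Java, using the JAXP API that ships with the JDK (the class name Demo is my own):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class Demo {
  static final String XML =
    "<?xml version=\"1.0\"?>"
    + "<book id=\"123\"><title>Object Thinking</title>"
    + "<author>David West</author>"
    + "<published><by>Microsoft Press</by><year>2004</year></published>"
    + "</book>";

  // Parse the document and run the XPath query against it
  static String year() throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
      .newDocumentBuilder()
      .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
    return XPathFactory.newInstance().newXPath()
      .evaluate("/book/published/year/text()", doc);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(year()); // prints "2004"
  }
}
```

Note that no traversal code is written by hand: the whole "find the year of publication" logic lives in the one-line XPath expression.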

This is not a full list, but these four features really mean a lot to me. They give my document the ability to be "self-sufficient." It can validate itself (XML Schema), it knows how to modify itself (XSL), and it gives me very convenient access to anything inside it (XPath).

There are many more languages, standards, and applications developed around XML, including XForms, SVG, MathML, RDF, OWL, WSDL, etc. But you are less likely to use them in a mainstream project, as they are rather "niche."

JSON was not designed to have such features, even though some of them are now trying to find their places in the JSON world, including JSONPath for querying, some tools for transformations, and json-schema for validation. But they are just weak parodies compared to what XML offers, and I don't think they have any future. Or let's put it this way: I wish they would disappear sooner or later. They just turn a good, simple format into something clumsy.

Thus, to conclude, JSON is a simple data format with no additional functionality. Its best-use case is AJAX. In all other cases, I strongly recommend you use XML.

© Yegor Bugayenko 2014–2018

10 Typical Mistakes in Specs


  • Palo Alto, CA

There is a great book called Software Requirements written by Karl Wiegers about, well, software requirements. It's a must-read for every software engineer, in my opinion. There's no need for me to repeat what it says, but there are a few very simple and very typical mistakes we keep making in our specs. I see them in our documents again and again, which is why I've decided to summarize them. So here they are, the ten most critical and typical of them, from the point of view of a programmer reading a specification document.

Reservoir Dogs (1992) by Quentin Tarantino

Section 4.3 of the famous IEEE 830-1998 standard says that a good specification should be correct, unambiguous, complete, consistent, ranked, verifiable, modifiable, and traceable. Eight qualities in total. Then, the standard explains them one by one in pretty simple English. But do we have time to read those boring standards? They are for university professors and certification boards. We are practitioners, for goodness' sake! ... Hold on, I'm joking.


No matter how small the project is and how practical we are, there is always a document that explains what needs to be done, and it may be called the "software requirements specification," or "specification," or just "spec." Of course, there is a lot of space for creativity, but we're engineers, not artists. We must follow rules and standards, mostly because they make our communication easier.

Now, I'm getting to my point. The specs I usually see violate pretty much all eight principles mentioned earlier. Below is a summary of how exactly they do it. By the way, all examples are taken from real documents in real commercial software projects.

No Glossary or a Messy One

How about this:

UUID is set incrementally to make sure there
are no two users with the same account number.

What is the difference between UUID and account number? Is it the same thing? It seems so, right? Or maybe they are different ... it would be great to know what UUID stands for. Is it "unique user ID" or maybe "unified user identity descriptor?" I have no idea. I'm lost, and I want to find the author of this text and do something bad to him ... or her.

I've written already that the worst technical specifications have no glossaries. In my experience, this is the biggest problem in all requirement documents. It's not prose! It's not a love letter! It's technical documentation. We can't juggle words for the sake of fun. We should not use product specs just to express ourselves. We're writing in order to be understood, not to impress the reader. And the rule is the same here as with diagrams: if I don't understand you, it's your fault.

Here is how that text would look after a proper re-writing:

UUID is user unique ID, a positive 4-byte integer.
UUID is set incrementally to make sure there
are no two users with the same UUID.

Better now?

Thus, the first and biggest problem is a frivolous use of terms and just words, without having them pre-defined in a glossary.

Questions, Discussions, Suggestions, Opinions

I've seen this very recently in a product spec:

I believe that multiple versions of the API
must be supported. What options do we have? I'd
suggest we go with versioned URLs. Feel free to
post your thoughts here.

Yes, this text exists verbatim in a requirements document. First, the author expresses his personal opinion about the subject. Then, the author asks me what possible options are out there. Then, he suggests I consider something, and after that, he invites me for a talk.

Impressive, right? Obviously, the author has a very creative personality. But we should keep this person as far away from project documentation as possible. Creativity of this kind has no place in a requirements document. Well, we do appreciate creativity, but these four things are strictly prohibited: questions, discussions, suggestions, and opinions.

Specifications can't have any questions in them. Who are these questions addressed to? Me, a programmer? Am I supposed to implement the software or answer your questions? I'm not interested in brainstorming with you. I expect you, a requirements author, to tell me what needs to be done. Find all your answers before writing the document. That's what you're paid for. If you don't have the answers, put something like TBD ("to be determined") there. But don't ask questions. It's annoying.

A requirements document is not a discussion board. As a reader of the spec, I expect to see exactly what needs to be done without any "maybe" or "we could do it differently." Of course you need to discuss these issues, but do it before documenting it. Do it somewhere else, like in Skype, on Slack, or by email. If you really want to discuss in the document, use Google Docs or Word with version tracking. But when the discussion is over, remove its history from the document. Its presence only confuses me, a programmer.

There's no need to format requirements as suggestions either. Just say what needs to be done and how the software has to work without fear of being wrong. Usually, people resort to suggestions when they are afraid to say it straight. Instead of saying "the app must work on Android 3.x and higher," they say "I would suggest making the app compatible with Android 3.x and higher." See the difference? In the second sentence, the author is trying to avoid personal responsibility. He's not saying "exactly Android 3.x;" he's just suggesting. Don't be a coward; say it straight. If you make a mistake, we'll correct you.

And, of course, opinions are not appreciated at all. It's not a letter to a friend; it's a formal document that belongs to the project. In a few months or weeks, you may leave the project, and somebody else will work with your document. The spec is a contract between the project sponsor and project team. The opinion of a document author doesn't make any difference here. Instead of noting "it seems Java would be faster" and suggesting "we should use it," say "Java is faster, so we must use it." Obviously you put it there because you thought so. But once it's there, we don't care who it came from and what you thought about this problem. The information would just confuse us more, so skip it. Just facts, no opinions.

Don't get me wrong, I'm not against creativity. Programmers are not robots, quietly implementing what the document says. But a messy document has nothing to do with creativity. If you want me to create, define the limits of that creativity and let me experiment within them; for example:

Multiple versions of the API must be supported. How exactly
that is done doesn't really matter.

This is how you invite me to be creative. I realize that the user of the product doesn't really have any justifications or expectations for the versioning mechanisms in the API. I'm free to do whatever I can. Great, I'll do it my way.

But again, let me reiterate: A specification is not a discussion board.

Mixing Functional and Quality Requirements

This is how it looks:

User must be able to scroll down through
the list of images in the profile smoothly and fast.

It's a typical mistake in almost every spec I've seen. Here, we mix together a functional requirement ("to scroll images") and a non-functional one ("scrolling is smooth and fast"). Why is it bad? Well, there is no specific reason, but it exhibits a lack of discipline.

Such a requirement is difficult to verify or test, difficult to trace, and difficult to implement. As a programmer, I don't know what is more important: to scroll or to make sure the scrolling is fast.

Also, it is difficult to modify such a statement. If tomorrow we add another functional requirement---scrolling a list of friends, for example---we'll want to require this scrolling to also be smooth and fast. Then, a few days later, we'll want to say that "fast" means less than 10 milliseconds of reaction time. We'll then have to duplicate this information in two places. See how messy our document may become eventually?

Thus, I would strongly recommend you always document functional and non-functional requirements separately.

Mixing Requirements and Supplementary Docs

This is similar to a previous problem and may look like this:

User can download a PDF report that includes a full
list of transactions. Each transaction has ID,
date, description, account, and full amount. The report
also contains a summary and a link to the user account.

It's obvious there are two things described in this paragraph. First is that a user can download a PDF report. Second is how this report should look. The first thing is a functional requirement, and the second one must be described in a supplementary document (or appendix).

In general, functional requirements must be very short: "user downloads," "user saves," "client requests and receives," etc. If your text gets any longer, there is something wrong. Try to move part of it to a supplementary document.

Un-measurable Quality Requirements

This is what I'm talking about:

Credit card numbers must be encrypted.
The app should launch in less than 2 seconds.
Each web page must open in less than 500 milliseconds.
User interface must be responsive.

I can find many more examples just by opening requirement specs in many projects I've seen over the past few years. They all look the same. And the problem is always the same: It is very difficult to define a truly testable and measurable non-functional requirement.

Yes, it's difficult. Mostly because there are many factors. Take this line, for example: "The app must launch in 2 seconds." On what equipment? With what amount of data in the user profile? What does "launch" mean; does it include profile loading time? What if there are launching problems? Do they count? There are a lot of questions like that.

If we answer all of them, the requirement text will fill an entire page. Nobody wants that, but having un-measurable requirements is a greater evil.

Again, it's not easy, but it's necessary. Try to make sure all quality requirements are complete and without ambiguity.

Implementation Instructions

This example illustrates a very common pitfall:

User authenticates via Facebook login button
and we store username, avatar, and email in the
database.

This is micromanagement, and it's something a requirements analyst should never do to a programmer. You shouldn't tell me how to implement the functionality you desire. You want to give a user the ability to log in via Facebook? Say so. Do you really care whether it's going to happen through a button click or somehow else? Do you really care what I store in the database? What if I use files instead of a database? Is that important to you?

I don't think so. Only in very rare cases will it matter. Most of the time, it's just micromanagement.

The spec should only require what really matters for the business. Everything else is up to us, the programmers. We decide what database to use, where the button will be placed, and what information will be stored in the database.

If you really care about that because there are certain higher-level limitations---say so. But again, not as implementation instructions to us programmers, but rather as non-functional requirements like this:

Login page must look like this (screenshot attached).
We must store user email locally for future needs.

The point is that I have nothing against requirements, but I'm strongly against implementation instructions.

Lack of Actor Perspective

The text may look like:

PDF report is generated when required. It is
possible to download a report or save it
in the account.

The problem here is that there is no "actor" involved. This functionality is more or less clear, but it's not clear who is doing all this. Where is the user? It is just a story of something happening somewhere. That's not really what programmers need in order to implement it.

The best way to explain functionality is through user stories. And a good user story always has, guess what ... a user. It always starts with "the user ...," followed by a verb. The user downloads, the user saves, the user clicks, prints, deletes, formats, etc.

It's not necessary for the user to be a human. It may be a system, a RESTful API client, a database, anything. But always someone. "It is possible to download ..." is not a user story. It's possible for whom?

Noise

How about this:

Our primary concern is performance and an attractive
user interface.

This is noise. As the reader of this document, I'm neither an investor nor a user. I'm a programmer. I don't care what your "primary concern" is in this project. My job is to implement the product so that it matches the specs. If performance is your primary concern, create measurable and testable requirements for me. I will make sure the product satisfies them. If you can't create a requirement, don't spam me with this irrelevant information.

I don't want to share your concerns, your beliefs, or your intentions. That's your business. And you're paid to properly and unambiguously translate all that into testable and measurable requirements. If you can't do this, it's your problem and your fault. Don't try to make it mine.

Very often ... wait. Very, very often. No. Almost always. Wrong again. Always! That's right, spec documents are always full of noise. Some of them have a bit less; some have more. I believe this is a symptom of lazy and unprofessional document authors. In most cases, just lazy.

They don't want to think and translate their concerns, ideas, thoughts, intentions, and objectives into functional and non-functional requirements. They just put them into the document and hope the programmers will somehow find the right solution. Good programmers should figure out what good performance means, right? Let's just tell them that performance is a concern for us, and they will figure something out.

No! Don't do that. Do your job right and let programmers do theirs.

And we, programmers, should never accept such documents. We should just reject them and ask requirements authors to re-work and remove noise. I would recommend not even starting to work on a product if there is a lot of noise in its specs.

Will Work, Needs to Work, Must Work

This is yet another very typical mistake:

The API will support JSON and XML. Both formats
must fully support all data items. XML needs to
be validated by XSD schema.

See how messy it sounds? There are three different points of view, and none of them are suitable for a specification document. A spec must describe a product as if it already exists. A spec must sound like a manual, a tutorial, or a reference. This text must be re-written like this:

The API supports JSON and XML. Both formats
fully support all data items. XML is validated
by XSD schema.

See the difference? All the "must," "need," and "will" words are just adding doubt to the document. For a reader of this spec, "the API will support" sounds like "some time in the future, maybe in the next version, it will support." This is not what the author had in mind, right? There should be no doubt, no double meaning, no maybe. The API supports. That's it.


I may have forgotten something important, but these issues are so obvious and so annoying ... I'm going to use this post as a simple guide for our system analysts. Feel free to share your experience with requirements documents below in the comments.


A Chatbot Is Better Than a UI for a Microservice


  • Seattle, WA

A chatbot (or chatterbot, as Wikipedia says) is a piece of software that talks to you in chat format. We use chatbots in a few (micro)services, and they fully replace user interfaces. I don't think there is any innovation in this approach, but it has proved to be very effective over the last year or so. That's the impetus for this post. Here is how the Rultor chatbot works for us and what its benefits are.

Let me give an example first. Look at the jcabi/jcabi-http#115 GitHub ticket:

The figure

Let's see what's going on here, and then we'll discuss how it's designed inside. Essentially, I'm talking to a chatbot here. The name of the chatbot is @rultor (I wrote about it last year). At 1, I'm asking the chatbot to release a new version of the jcabi-http library. At 2, the chatbot responds, just confirming that the task is clear and that it's on it. At 3, the bot says the job is completed and its completion took nine minutes. Our conversation is over. That's it.

Now, what is so special about this?

One thing: There is no user interface. Well, there is no traditional web-based HTML/CSS user interface. There is no login, logout, profile, menu, or anything like this. Rultor is a web service that has no web UI. The only way to communicate with it is by talking with its chatbot.

What's so good about it? A few things.

Service Is Not a Server

This is how the traditional architecture of a web system would look:

PlantUML SVG diagram

A user gives instructions to a service and receives responses. This communication happens through a user interface (UI)---a bunch of HTTP entry points that receive requests from a browser and return HTML+CSS responses. Or, if a user is on another service, requests may contain some data, and responses will be in XML or JSON. You get the idea; a user is a client, and the service is a server.

Like in a restaurant---you say what you want, and a server goes to the kitchen, waits there, and in a few minutes, comes back with spaghetti carbonara. You're a client, and that cute lady is a server.

With a chatbot, that's no longer the case. Look at the architecture:

PlantUML SVG diagram

First, a user posts a request to GitHub through a web user interface provided by GitHub. It is a communication hub for us. Then, the service connects to GitHub through its RESTful API and checks whether there are any new requests there. If something new is found, the service does the job, prepares a response, and posts it there. The client receives an email notification about a new response just posted to the ticket. The client then checks GitHub and finds the response.
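The polling workflow just described can be sketched in a few lines of Python; here an in-memory board stands in for GitHub, and all class and method names are invented for illustration:

```python
# A toy sketch of the "client of a communication hub" pattern.
# The Board stands in for GitHub; names are invented for illustration.

class Board:
    """A shared message board (the communication hub)."""
    def __init__(self):
        self.messages = []

    def post(self, author, text):
        self.messages.append({"author": author, "text": text})

    def unanswered(self, bot):
        """Requests addressed to the bot that it has not yet replied to."""
        mentions = [m for m in self.messages
                    if m["text"].startswith("@" + bot)]
        replies = [m for m in self.messages if m["author"] == bot]
        return mentions[len(replies):]

class Chatbot:
    """A service that polls the hub instead of serving requests."""
    def __init__(self, name, board):
        self.name, self.board = name, board

    def poll(self):
        # The bot decides when this runs; nobody is waiting on it.
        for request in self.board.unanswered(self.name):
            # Do the actual job here, then confirm on the board.
            self.board.post(self.name, "Done: " + request["text"])

board = Board()
board.post("yegor", "@rultor please release version 1.2")
bot = Chatbot("rultor", board)
bot.poll()
```

The point of the sketch is that the service calls poll() on its own schedule; the user and the service never talk to each other directly, only through the hub.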


Here is how this would look in a restaurant: There would be a board with sticky notes. First, you write the note, "I'd like spaghetti carbonara with parmesan and fresh pepper on top" (Damn, I'm just too hungry now), and pin it to the board at number 15. Then, you return to your table. A chef from the kitchen checks that board and finds your sticky note. He makes that spaghetti, tops it with parmesan, fresh pepper, some basil leaves, and virgin olive oil ... yeah, he makes it right ... and puts it next to the board. You hear an announcement that order number 15 is ready. You go there, collect the food, return to your table, and enjoy.

The point is that there is no cute lady involved anymore. There is no server. There are two parties communicating with the board---you and the kitchen. The kitchen is our microservice, but it's not a server anymore.

These two parties are perfectly decoupled now. They never talk to each other. And they both are clients of the communication hub, which is GitHub or a board in the restaurant.

Again, the microservice is not a server anymore. Instead, it is a client of a communication hub. And the flip of its position provides a lot of benefits to us, its developers.

No Need to Be Fast


First of all, we don't need to care much about the performance of our UI. Well, we don't care at all, since we don't have a UI. Do we care about the speed of responses on GitHub? Not really. When a user posts a message to GitHub, he or she doesn't expect our chatbot to give an immediate answer in less than 100 milliseconds. (That's what any properly designed web system must guarantee, I believe.)

We put a note on the board, and we assume that the kitchen is probably doing something else at the moment. We'll wait for a few seconds or even minutes. If, on the other hand, I give an order to the waitress and she waits five seconds before replying back, I'll be very surprised. If she keeps doing that with every question, I'll start to wonder to myself if everything is OK with her.

I expect a user interface to be instant, while in a chat I have no problem allowing some time for the bot to answer. This happens naturally. We're used to delays in chats, when we're talking with real people. They need some time to process our information, to think, and to type something back.

But a user interface doesn't have that luxury. It has to be bullet-fast; otherwise, I immediately get frustrated. The same thing happens to you, right?

No Need to Look Cute


Another advantage of this no-server design is that there is no need to look pretty. There is no web interface, no HTML, no CSS, no graphic design. Perhaps not everybody really likes that. Most non-professional users may still prefer to talk to a cute server instead of sticking some paper notes to the board. But if we're dealing with professional computer engineers, they're not that demanding.

Rultor doesn't have any web UI, and its users simply don't know how it "looks." It just talks to you. The only thing you see is its avatar in GitHub.

This saves a lot of money and time on design efforts, which are usually very expensive if you aim for high quality. If your web service looks average, most of its users will assume that it also works average. Many good ideas have simply died because their UI wasn't as impressive as people were used to, thanks to all those Pinterests and Instagrams.

A good-looking server has a greater chance for bigger tips, right? If there is no server and we don't see the chef, we judge him or her only by the quality of the food.

Same here. By getting rid of a UI, we allow ourselves to focus on the quality of the service we're delivering. We don't burn our time and money on being nice. We spend them on being useful.

Much Easier to Scale


If we have too many stickies on that board, we just hire more cooks, or maybe even build another kitchen, and the problem is solved. We can handle as many customers as necessary. Well, as long as the board is powerful enough to handle multiple parallel users.

GitHub is a pretty big platform, with hundreds of thousands of users and projects. If we have too many requests coming in, we can just add more processing nodes to Rultor. Remember, we're not a server anymore; we are a client of GitHub. We decide when to connect to GitHub and when to create responses to the requests submitted.

It is much easier to create a scalable client than a scalable server, mostly because there is nobody really waiting for us to respond quickly. The load of requests we're getting can be managed much easier, since the decision of when to process them is made by us.

Mistakes Are Not So Visible


When you're standing in front of a customer, most of your mistakes are unforgivable, primarily because they are very visible. On the other hand, when you're cooking something in the kitchen, nobody can see you and spot your faults. They will only spot them if the spaghetti has too much salt. In other words, they will judge you by your results, not by how you produce them.

It's the same story with the microservice. When it works as a server, we expect it to be seamless, respond immediately, and present everything in a structured and organized way. If something goes wrong, it's right here on the web page. Your best case is a 404, while the worst one is that you present some wrong information to the user. Even though the bug may not be critical inside the microservice engine, the user doesn't know that. The user will judge you by your appearance and won't forget even small mistakes.

However, when you both are clients of a message board, you don't see each other. The user communicates with GitHub, and the microservice interacts with GitHub. Mistakes are less visible. Trust me, we have had many of them over the 18 months that Rultor has been in public use. We've had downtimes, we've had serious logical mistakes, and we've had data corruption. But very rarely have these problems become visible online. We merely saw them in our server logs. Users didn't see them. Well, mostly :)

Everything Is Traceable


Since there is a communication board between us, it's very easy to see the entire history of our discussion, which is very intuitive. It's like a Slack chat history. You see what we started from, who said what, and which conclusions were made.

Basically, you can't have that visibility in a web UI. Well, you can probably create a special page with the "history of operations," but who would check it? And how visible and simple would that information be? And, what's most important, how would that information match up with the UI?

The log may state that "the build was started," but what is the build, and how was it started? How can I start it again? Using which buttons and web controls? It's not clear.

Thus, the traceability of a chronological chat is unbeatable.

Easy to Integrate With Other Services

Yes, think about the future of this approach. If there is a centralized message board where users talk to a chatbot, why can't other chatbots talk to each other too?

Forget about RESTful APIs. Just a message board where chatbots post their requests and collect responses. They are perfectly decoupled, replaceable, and very scalable. Also, their communication protocol is visible and very traceable. And they boast many other benefits, as was just explained above. It's much more convenient for us, both users and programmers, to monitor them and create them.

Well, maybe it's too extreme to get rid of RESTful APIs entirely, but to some extent, this approach is feasible, I believe.

I didn't go too far with this idea, but something was done. We have a messaging platform that allows multiple chatbots to communicate with users. It's called Netbout. It's a very primitive web system with isolated discussions. Simply put, anyone can create a new discussion, invite a few friends, and post messages there. Both users and chatbots can do that.

So, when a new candidate wants to join Zerocracy, we ask that person to fill out an online form. When the candidate clicks the "Submit" button, a new discussion starts, and the first chatbot decides who should interview that person. The decision is made according to the skills listed in the form. The chatbot invites one of our best programmers to conduct the interview. When the interview is done, another chatbot explains to the candidate what the next steps are, registers him or her in our database, and starts to show the progress of work.

From a user perspective, it looks like he or she is talking to a few people who understand just a few simple commands. It is very intuitive and was easy to design.

I think chatbots are a good approach for interacting with microservices. Especially when users are more or less professional.

PS. Illustrations by Kristina Wheat.


Why Software Outsourcing Doesn't Work ... Anymore


  • Palo Alto, CA

I want to create an iPhone app for my web service, but I don't have programmers. Well, I don't have iOS programmers. And I don't have money. Sound familiar? What do I do? Right, I go to Upwork and find an awesome company in Bangalore that is excited to work with me for reasonable money. In a few months and after a few thousand dollars, I realize this is not exactly what I expected. After yet another few months, I swear to God I'll never outsource any software development to anyone. Is it just me? Not really.

The Godfather: Part II (1974) by Francis Ford Coppola

This preamble is just a joke, but it's not so far from the truth. Of course, in bigger companies and bigger projects, the story will be different. But the outcome is almost always the same---it is a disaster.

I'm talking about outsourcing, not offshore development. The difference is that in outsourcing, there are two companies involved: you the client and some WeCodeLikeNoOneElse Inc. from Loompaland. In offshore development, you just open an office in that same Loompaland with your own management and employees. Again, I'm discussing outsourcing here.

Before writing this, I read a few dozen articles about why outsourcing fails, and I've found a dozen "reasons" why. However, I think they all miss the point, because they are looking at the problem from a paying customer's point of view. I try to look at it from both sides and tell you the ugly truth. More on that in a few paragraphs. For now, let's explore what the usual reasons are.

Cheapest Providers. Here is the argument: "So you're outsourcing because you want to optimize costs? You will end up with the cheapest software shop and sincerely regret it very soon." OK, what's the solution, then? Just pay more? I don't think that's going to solve the problem; I'll just burn more money. Also, I don't think this reason has anything to do with outsourcing specifically. In any other business transaction, a "win-lose" scenario is a straight path to failure.

Cultural Mismatch. "You're in California, and they are in Brazil; you won't understand each other." Is that why we have cost overruns, schedule slippage, and low quality of code? I don't think so. Moreover, my experience tells me the opposite. Our programmers at Zerocracy are from more than 15 countries, and we've never seen cultural issues tangled up in our work conflicts, of which we have plenty.

Lack of Face-to-Face Talking. "They are far away somewhere in Poland, so you rarely really talk to them. That's why you misunderstand each other." Look, have you ever met me and had a face-to-face talk with me? I'm talking to you, the reader of this post. That's right, you haven't met me, but you're having no problem understanding my point just by reading this text. That's mostly because I've made all possible efforts to ensure my point is clear to you. I'm interested in delivering my thoughts to you, and it is happening. In outsourcing, the problem is not with the channel but with the motivation. Read on.

No Metrics to Measure Success. "You simply can't define clear metrics of success for a team overseas. That's why your relationship eventually falls apart." Or something like that. I didn't quite get what's meant by "metrics of success," but if it's what I think it is, they are right: Success for a software outsourcing shop in Kiev is one thing. Success for you, a client of this shop, is something very different. Read on.

Poor Specs. "It's just not possible to make good specifications for most projects, and a poorly designed spec is a recipe for failure." Yes, that's very true, but what does this have to do with outsourcing? Ah, right, they are so far away in Argentina and we're here in New York City---how can we make a good spec? I don't buy it. An inability to clearly and explicitly specify technical requirements is a flaw of the architect. Learning, training, and reading should fix this. Getting everybody together in the office is not a solution.

Leakage of Talent. "Developers offshore are not your employees. They will never be loyal to the project, and the best of them will quit once in a while." Yes, people may leave once in a while. But again, how is that related to their location? If they don't use the same coffee machine as their CEO, will they be less loyal to the project? There are many other more effective instruments to boost motivation in a team than just co-locating everybody.

There could be many more, but this is enough for us. As you see, I don't find these "reasons" logical. They merely explain the consequences but never even touch the real problem, which I believe sounds like this:

You're just a cash cow for an outsourcing company.

You're neither a partner nor a friend, despite all your expectations.

Your goals are the opposite of their goals.

All these "reasons" for outsourcing failures originate in this fundamental confusion that exists in our heads: We think these 10 programmers sitting in Beijing are part of our business. We believe they are our team. They are with us in the same boat, sharing the same values and looking in the same direction.

It's just not true.

It can't be true.

I've been in the shoes of an outsourcing company for almost 10 years (and quit in 2010). The ugly truth is that for a CEO of an outsourcing shop, the only problem is how to take care of the next month's payroll, and 90 percent of all expenses are salaries for the programmers.

That's why a good customer for them is a paying customer. Not a customer with a successful project. Not a customer with a properly solved problem. Not a customer with optimized costs. Not a customer with the best possible technology utilized. Not at all. The best customer is the one that pays, pays a lot, and pays on time. Period.

That's the root cause of all problems with outsourcing.

The title of this article states that outsourcing doesn't work anymore. Why anymore? Did it work before? Yes, it did, when salaries of programmers were dramatically lower in third-world (offshore) countries. For example, in 2001, we had a team of very good senior Java developers in Ukraine. We paid them above the market price, and it was $300 per month. At the same time, we were selling their time to U.S. customers for $15 per hour, which was $2,500 per month. See the margin?

With such a margin, outsourcing works fine. I was a CEO and had almost no worries about payroll. I had enough money to pay my team, even if we lost some customers eventually. Let me put it this way: I had the luxury to be honest with my customers.

Not anymore.

Put yourself in their shoes. Today, a good Java developer in Ukraine earns $4,000 a month. On top of that, this developer expects health insurance, a free gym membership, free lunch, paid vacation, paid sick leave, etc. At the same time, the price of Java time is not much higher on the market than it was years ago. Even if you charge $40 per hour (which is very unlikely), your income would be $6,800. Again, your income is $6.8K and your expenses are close to $5K. See the margin?

And don't forget about office expenses, taxes, computers, administrative staff, team building events, etc. Because of such a small margin, you will literally be broke if you lose a key paying customer. You just can't afford to keep your programmers "on the bench" for too long.

Thus, your sole motivation is to keep that cash flow coming. No matter what. The longer the project, the better. The lower the quality of code, the better---more money for maintenance. The more phone calls, meetings, and other time-wasting events, the better. The more mess in specs, the better. Just do whatever it takes to suck money from the customer.

You have to do this, not because you're evil but because you have to feed your team. Your team. Yes, the team is yours. You're responsible for their salaries, not the customer. In order to protect the team, you have to go against the real interests of your customers. You simply can't be in the same boat with them.

The point of all this is that outsourcing simply can't work, because your business interests can't be aligned with the interests of your outsourcing "partner."

© Yegor Bugayenko 2014–2018

What Do You Do With InterruptedException?

  • Palo Alto, CA

InterruptedException is a permanent source of pain in Java, for junior developers especially. But it shouldn't be. It's a rather simple and easy-to-understand idea. Let me try to describe and simplify it.

Crouching Tiger, Hidden Dragon (2000) by Ang Lee

Let's start with this code:

while (true) {
  // Nothing
}

What does it do? Nothing, it just spins the CPU endlessly. Can we terminate it? Not in Java. It will only stop when the entire JVM stops, when you hit Ctrl-C. There is no way in Java to terminate a thread unless the thread exits by itself. That's the principle we have to keep in mind, and everything else will just be obvious.

Let's put this endless loop into a thread:

Thread loop = new Thread(
  new Runnable() {
    @Override
    public void run() {
      while (true) {
      }
    }
  }
);
loop.start();
// Now how do we stop it?

So, how do we stop a thread when we need it to stop?

Here is how it is designed in Java. There is a flag in every thread that we can set from the outside. And the thread may check it occasionally and stop its execution. Voluntarily! Here is how:

Thread loop = new Thread(
  new Runnable() {
    @Override
    public void run() {
      while (true) {
        if (Thread.interrupted()) {
          break;
        }
        // Continue to do nothing
      }
    }
  }
);
loop.start();
loop.interrupt();

This is the only way to ask a thread to stop. There are two methods used in this example. When I call loop.interrupt(), a flag is set to true somewhere inside the thread loop. When I call Thread.interrupted(), the flag is returned and immediately set to false. Yeah, that's the design of the method: it checks the flag, returns it, and sets it to false. It's ugly, I know.
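To see this clear-on-read behavior in isolation, here is a minimal sketch (the class name is mine); isInterrupted(), by contrast, only reads the flag without clearing it:

```java
public class InterruptedFlagDemo {
  public static void main(String[] args) {
    Thread.currentThread().interrupt();       // set the flag on the current thread
    System.out.println(Thread.interrupted()); // prints "true" and clears the flag
    System.out.println(Thread.interrupted()); // prints "false": the flag was reset
    Thread.currentThread().interrupt();       // set it again
    // isInterrupted() only reads the flag, without clearing it:
    System.out.println(Thread.currentThread().isInterrupted()); // prints "true"
    System.out.println(Thread.currentThread().isInterrupted()); // prints "true" again
  }
}
```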

Thus, if I never call Thread.interrupted() inside the thread and don't exit when the flag is true, nobody will be able to stop me. Literally, I will just ignore their calls to interrupt(). They will ask me to stop, but I will ignore them. They won't be able to interrupt me.

Thus, to summarize what we've learned so far, a properly designed thread will check that flag once in a while and stop gracefully. If the code doesn't check the flag and never calls Thread.interrupted(), it accepts the fact that sooner or later it will be terminated cold turkey, by hitting Ctrl-C.

Sound logical so far? I hope so.

Now, there are some methods in JDK that check the flag for us and throw InterruptedException if it is set. For example, this is how the method Thread.sleep() is designed (taking a very primitive approach):

public static void sleep(long millis)
  throws InterruptedException {
  while (/* You still need to wait */) {
    if (Thread.interrupted()) {
      throw new InterruptedException();
    }
    // Keep waiting
  }
}

Why is it done this way? Why can't it just wait and never check the flag? Well, I believe it's done for a good reason. And the reason is the following (correct me if I'm wrong): The code should either be bullet-fast or interruption-ready, nothing in between.

If your code is fast, you never check the interruption flag, because you don't want to deal with any interruptions. If your code is slow and may take seconds to execute, make it explicit and handle interruptions somehow.

That's why InterruptedException is a checked exception. Its design tells you that if you want to pause for a few milliseconds, make your code interruption-ready. This is how it looks in practice:

try {
  Thread.sleep(100);
} catch (InterruptedException ex) {
  // Stop immediately and go home
}

Well, you could let it propagate up to a higher level, where someone else will be responsible for catching it. The point is that someone will have to catch it and do something with the thread. Ideally, just stop it, since that's what the flag is about. If InterruptedException is thrown, it means someone checked the flag, and our thread has to finish what it's doing ASAP.

The owner of the thread doesn't want to wait any longer. And we must respect the decision of our owner.

Thus, when you catch InterruptedException, you have to do whatever it takes to wrap up what you're doing and exit.

Now, look again at the code of Thread.sleep():

public static void sleep(long millis)
  throws InterruptedException {
  while (/* ... */) {
    if (Thread.interrupted()) {
      throw new InterruptedException();
    }
  }
}

Remember, Thread.interrupted() not only returns the flag but also sets it to false. Thus, once InterruptedException is thrown, the flag is reset. The thread no longer knows anything about the interruption request sent by the owner.

The owner of the thread asked us to stop, Thread.sleep() detected that request, removed it, and threw InterruptedException. If you call Thread.sleep() again, it will not know anything about that interruption request and will not throw anything.

See what I'm getting at? It's very important not to lose that InterruptedException. We can't just swallow it and move on. That would be a severe violation of the entire Java multi-threading idea. Our owner (the owner of our thread) is asking us to stop, and we just ignore it. That's a very bad idea.

This is what most of us are doing with InterruptedException:

try {
  Thread.sleep(100);
} catch (InterruptedException ex) {
  throw new RuntimeException(ex);
}

It looks logical, but it doesn't guarantee that the higher level will actually stop everything and exit. They may just catch a runtime exception there, and the thread will remain alive. The owner of the thread will be disappointed.

We have to inform the higher level that we just caught an interruption request. We can't just throw a runtime exception. Such behavior would be too irresponsible. The entire thread received an interruption request, and we merely swallow it and convert it into a RuntimeException. We can't treat such a serious situation so loosely.

This is what we have to do:

try {
  Thread.sleep(100);
} catch (InterruptedException ex) {
  Thread.currentThread().interrupt(); // Here!
  throw new RuntimeException(ex);
}

We're setting the flag back to true!

Now, nobody will blame us for having an irresponsible attitude toward a valuable flag. We found it in true status, cleared it, set it back to true, and threw a runtime exception. What happens next, we don't care.

I think that's it. You can find a more detailed and official description of this problem here: Java Theory and Practice: Dealing With InterruptedException.


Software Quality Award, 2016


This is the second year of the Software Quality Award. The prize is still the same---$4,096. The rules were changed a bit. Read on. BTW, 2015 is here.

Rules:

  • One person can submit only one project.

  • Submissions are accepted until September 1, 2016.

  • I will check the commit history to make sure you're the main contributor to the project.

  • I reserve the right to reject any submission without explanation.

  • All submissions will be published on this page (including rejected ones).

  • Results will be announced October 15, 2016 on this page and by email.

  • The best project will receive $4,096.

  • Final decisions will be made by me and are not negotiable (although I may invite other people to help me make the right decision).

  • Winners that received any cash prizes in previous years can't submit again.

Each project must be:

  • Open source (in GitHub).

  • At least 10,000 lines of code.

  • At least one year old.

  • Object-oriented (that's the only thing I understand).

The best project is selected using these criteria.

What doesn't matter:

  • Popularity. Even if nobody is using your product, it is still eligible for this award. I don't care about popularity; quality is the key.

  • Programming language. I believe that any language, used correctly, can be applied to design a high-quality product.

  • Buzz and trends. Even if your project is yet another parser of command line arguments, it's still eligible for the award. I don't care about your marketing position; quality is all.

By the way, if you want to sponsor this award and increase the bonus, email me.


60 projects submitted so far (in random order):

15 Oct 2016: I asked one of our Java developers to do a preliminary analysis of all projects. This is the report he sent me back today: award-2016.txt (you can find your project there). I added my comments to his pluses and minuses (see them right in that text file). Based on his opinion and a preliminary analysis, I picked these finalists:

  • pholser/junit-quickcheck (Java)
  • NullVoxPopuli/aeonvera (Ruby)
  • SimonKagstrom/kcov (C++)
  • skinny-framework/skinny-framework (Scala)
  • paypal/squbs (Scala)
  • ben-manes/caffeine (Java)
  • coala/coala (Python)

I'm sorry for being late, but I need a few more days to analyze them properly and decide which one gets the prize. I will announce the winner on the 21st of October, in six days. I will email everybody and publish my decision here.

18 Oct 2016: This is my analysis of those seven finalists. I tried to pay as much attention to each project as possible (they are all rather good).

pholser/junit-quickcheck (19K LoC, 80K HoC)

  • Changes are not really traceable; for example, this commit has no link to any GitHub issue. Why was it made?
  • GitHub releases are there, but I couldn't find how exactly they are made; where is the release procedure? Also, they are not documented in any way; there are no release notes.
  • There are some utility classes, for example Lists, Items, and Sequences.
  • There are some -ER classes, for example Shrinker and SampleSizer (which has a code-rich constructor).
  • The design of "generators" is not really object-oriented, I guess. They are all providers of procedures, not really "objects" in terms of OOP.
  • Score: 5

NullVoxPopuli/aeonvera (46K LoC, 835K HoC)

  • "Vendor" assets (including jQuery) are right in the GitHub repo, which is a bad practice.
  • It's Ruby on Rails, which is MVC, which is not really OOP. And there is also an ORM with an anemic model (in the models/ dir). Aside from that, "serializers," "services," "validators," etc.---not really OOP.
  • I didn't find any GitHub releases.
  • There is no official release procedure in the repo. I simply can't understand how this product goes to production, the process is not automated (or the script is not in the repo). It's a serious problem for a web app.
  • Aside from that, the app is definitely cleaner than many other similar RoR web apps. Good job.
  • Score: 4

SimonKagstrom/kcov (15K LoC, 50K HoC)

  • I didn't find any static analysis, although there are many open source tools for C/C++.
  • Some methods are too complex; see, for example, ptrace.cc, elf-parser.cc, or html-writer.cc.
  • There are some -ER classes, like "writers", "verifiers", and "parsers", for example: AddressVerifier or DwarfParser.
  • Interfaces are prefixed with I, which is a bad practice in OOP, see reporter.hh.
  • There are some "utils", which I would replace with classes (most of them): utils.hh.
  • Score: 4

skinny-framework/skinny-framework (44K LoC, 191K HoC)

  • It seems that the owner of the repo is making commits directly to the master branch, without any issues or pull requests.
  • I didn't find any static analysis.
  • I didn't find code coverage control, although there are many unit tests.
  • There are utility classes, for example StringUtil, DateTimeUtil.
  • It's MVC, which is an anti-pattern in OOP.
  • There is ORM, which also is an anti-pattern.
  • The complexity of the code is sometimes very high; for example, this file: AssociationsFeature.scala.
  • Score: 2

paypal/squbs (28K LoC, 163K HoC)

  • Where is static analysis?
  • Exception swallowing seems to be a common practice here, for example: here, here, here, etc.
  • There are utility classes, for example ConfigUtil.
  • Overall impression is that the complexity is rather high, for example these files are really hard to understand: ServiceRegistry.scala or UnicomplexBoot.scala.
  • Score: 3 (mostly due to complexity)

ben-manes/caffeine (47K LoC, 191K HoC)

  • Some files are very long, for example BoundedLocalCache.java (3,300 LoC!). Aside from a few super long files, there are many .java files with 300+ LoC; that's too much, and complexity is high because of it. The code looks clean but very complex.
  • There are some -ER classes, for example Weigher can easily be renamed to Weight. Same for "loaders" and "writers."
  • Score: 4

coala/coala (15K LoC, 274K HoC)

  • This project was a finalist last year.
  • There are some -ER classes, like "parsers", "collectors", "importers."
  • I still don't see consistent code formatting; it's really strange to see so many spaces in front of some lines, placed there without any logic (at least I can't see that logic).
  • Some code is pure global, without any classes, like in Processing.py and in Collectors.py (maybe it's inevitable, but still looks strange).
  • Some files are rather long, like Lynter.py or ConsoleInteraction.py.
  • Score: 5

I can't find a strong single leader... Let me think about it.

23 Oct 2016: I didn't find a single leader this year, so the prize goes to two projects: pholser/junit-quickcheck and coala/coala ($2,048 to each).

Congratulations to @pholser and @sils, the winners!

Here are your badges:

winner   winner

Put this code into your GitHub README (replace ??? with your GitHub name in the URL):

<a href="http://www.yegor256.com/2015/10/17/award-2016.html">
  <img src="//www.yegor256.com/images/award/2016/winner-???.png"
  style="width:203px;height:45px;" alt='winner'/></a>

Thanks to everybody for participating! See you next year.


What Is the Difference Between Ridley Scott and Joseph Goebbels?

  • Palo Alto, CA

I saw The Martian this weekend, and it triggered a few thoughts. Of course, I didn't like the movie as a piece of art. It is total garbage, but this is not my point. There is something bigger to discuss, aside from the bad acting, primitive story-line, politically correct but absolutely unrealistic casting, and tons of logical inconsistencies. It's Hollywood; what should I expect, right? Not just that. I think the problem is bigger.

Cossacks of the Kuban (1950) by Ivan Pyryev

Have any of you seen this movie: Cossacks of the Kuban? It was shot in 1949, when Joseph Stalin was in power, the Soviet Union was literally broke, and WWII brought people to the point of starvation. However, the film showed something completely opposite---wealthy villages, rich peasants, and tables full of food.

It was propaganda in 1949.

But isn't it quite similar to what I saw just a few days ago, produced and directed in 2015 by Ridley Scott?


In 1949, the goal of Soviet propaganda was to convince people that their personal situations with a lack of food and lack of future were just their local, personal exceptions to a more general rule. And that rule was that the country was full of food. The country was governed by the principles of socialism, and they were working perfectly.

In 2015, the goal of Hollywood propaganda is to convince us that the organizational and motivational problems in our offices are just local exceptions to the general rule. The rule is simple: project management is not important if we're all good friends.

Ridley Scott is telling us that in a perfect organization, such as NASA, everybody loves everyone; that's why they can get a man from Mars without even a map. Do the same in your company and you will be fine. You don't need risk planning, you just need a hero. Actually, you'd be better off with a couple of heroes who love each other.

That doesn't work for you? It must be a problem with implementation. Keep trying and smiling.

Make friends, don't make plans.

It is a lie, very similar to the lie we heard in 1949.

The truth is that you are not going to get anywhere if you follow the spirit of this movie. In reality, teamwork must look completely different. There are conflicts, fights, politics, betrayals, back-stabbing, leakage of information, and just primitive incompetence. To manage all this, one can't just be a nice guy with a big heart. I would even say that being a nice guy is a drawback for any management position in a modern organization. Well, in any organization at any time and in any place.

Project management is not about compassion and sympathy. It is about accurate and routine comparison of risks, probabilities, impacts, and their mitigation plans. It is about setting rules and making decisions. It is about making sure these decisions are being executed, precisely and without mistakes. It is about making sure those who've made mistakes are punished while those who've done everything right are rewarded.

A team of six. In a multi-million-dollar spaceship. Flying to another planet to save one person. Against explicit instructions from upper management. They come back as national heroes. Are you serious?

Have you tried to deploy a new feature on a production server against the direct will of your boss? Try it. No spaceships, no Mars. Just a piece of code and a simple server. Then try to convince your boss that you're a hero.

I'm sure you get the point.

Shooting The Martian (by popsci.com)

So, why is Ridley Scott lying to us? Why is he giving us a false picture of reality? Intentionally false. He knows better than I do how real management works in real-life organizations. Hollywood is not much different than Silicon Valley in this aspect. So, why is he lying?


Why was Joseph Goebbels, a minister of propaganda in the Third Reich, lying to the German people?

Because that's what we like to hear, unfortunately.

It is sad, but we don't want to know the truth. We didn't want to know about Nazi war crimes---so Joseph Goebbels built a fake reality for us. We don't want to know about the true principles of management---so Ridley Scott built fake ones for us.

Think about it.


Competition Without Rules Is Destructive

  • Moscow, Russia

When your team has to choose which technical decision to make, who has the final say? When one of your colleagues asks for a raise, who decides, and what is his or her decision based on? When it's necessary to work overtime, how is it decided who will stay in the office? I'm expecting you to shrug your shoulders. You're right, these questions never have explicit answers in modern organizations. We are used to working in a more "democratic" way, where such decisions are made subjectively by managers or more senior employees. Is this how it should be?

The Wrestler (2008) by Darren Aronofsky
The Wrestler (2008) by Darren Aronofsky

We are trying to avoid explicitness in these sensitive subjects. Indeed, how can we tell Jeff that his salary is lower than Monica's because his performance is worse? This will definitely lead to depression and negativity within the team, right?

What I'm trying to say is that we don't set rules. We think that strict and explicit rules related to performance offend creative people. Well, all people.

We avoid explicitness in performance appraisals.

And this is totally wrong!

This is a mistake, and it causes big problems!

When a group doesn't have explicitly defined principles of survival and growth, it starts to create them naturally. When people don't know what exactly needs to be done in order to get a 15 percent raise, they find a way to get this information anyway. And guess what this information will end up being? Right---you have to make your boss happy; that's how your chances for a raise improve.

Instead of working toward the goals set by the organization, we are fighting with each other for the attention of our boss. Instead of focusing on the results and their quality, we are reading the mood of our manager. Our fear becomes a guide for us.

Competition is inevitable in a group, especially if the group consists of creative people. Creativity is all about competition. Each of us wants to be better than the others, and this is what drives innovation. But if the team has no rules, even a minor competition seriously and negatively affects their motivation.

If you want your team to be creative and productive, clearly and explicitly define the rules of competition. Make sure everyone can get clear and straightforward answers at any time to questions like these:

  • Who is the best developer on our team?

  • Why is my salary lower than Jeff's?

  • What do I need to do in order to get a raise?

  • Under what conditions will I be fired?

Can you ask these questions in your team and easily get explicit answers?


How to Be a Good Office Slave

  • Palo Alto, CA

This is a short manual for you, my friend. I assume you are sitting in the office right now, reading this blog post. Maybe you don't like your office job, or maybe you enjoy it and feel excited to be close to your office friends. It doesn't matter. What matters is that there is always an alternative to office slavery. I'm not talking about starting your own business. There are people in this world who work for someone without doing what is described below. They do exist, as well as companies that don't turn their employees into slaves. I really hope you will eventually find one. In the meantime, this manual is for you :)

The Office (2001–2003)

Help Others. Find the stupidest newbies and help them. Regardless of what exactly you help them with, they should rely on you. Show them where the restroom is, recommend a good restaurant nearby, assist in an IDE installation, explain how the project works, and make standard jokes about the worst class in it. They must become your best friends---and not only them. Be helpful to everybody. No matter what is happening, everyone must know that you're ready to help. Ideally, they all must depend on your kindness and readiness to save them from the chaos around.

Be the Last to Leave the Office. Nothing annoys a manager more than an employee who leaves the office at 5 p.m. sharp. It's a sign of disrespect. Don't you like it here? Is there anything in this life more important to you than this job? There shouldn't be. Demonstrate that by staying late. Here is a simple trick: just come later. The boss won't blame you for that. But always stay there after everybody else is gone. Ideally, you should leave right after the boss. Overtime is a clear sign of your loyalty to our mutual results.

Don't Nag. No matter what is happening, you should never criticize your direct manager. The boss is always right. Everything else may be wrong---the situation, colleagues, suppliers, computers, the CEO, investors, the market, or the weather, but not the boss you directly report to. The word of this person is the law. The boss is the god. Ideally, you should be the prophet. No matter what the boss says, you deliver it to others. And you must look like you sincerely believe that it's right.

Attend All Meetings. No matter what they are about, you must be there. And don't just be present; actively participate. It's not so difficult, and very soon you will start to understand what they are talking about and will be able to say something, even if you had no idea about the subject beforehand. Eventually, everybody will start thinking they must ask your permission in order to make some decision, because you were at that meeting. Important people don't write code; they attend meetings. Remember that.

Turn Down Recruiters, Publicly. Loyalty! That's what matters to a real team. When a recruiter calls you, raise your voice and explain that you're happy in this company and don't want to move on, ever. The more people who hear you, the better. Also, you can sometimes tell stories about offers you're getting and how you turn them down. Your boss should be the main audience for these stories. Why do you turn them down? Not because they are bad, but because your life belongs to this company. Loyalty is what makes you a good slave; don't forget it!

Don't Take Sides. It's just too risky. In any argument, you can always find pros and cons for both sides, right? So why support one of them? You may be wrong and lose respect in front of everybody. Why take that chance? Instead, always say that there are drawbacks to both options. That's what a wise man would say, anyway. There is no absolute truth in this world. That's why you should always stay in the middle, where you will never be wrong. Well, until your boss takes one of the sides. That's the right moment to agree and follow.

Never Ask for a Raise. It should be absolutely clear to everybody that you don't work for money. You work for the big idea. Period.

Attend All Social Events. Birthdays, corporate parties, Halloween, Friday beers---you must be there, always. Don't worry about wasting your life; you will like them eventually. It is very important to demonstrate that you truly live in the office. You are not just writing code, taking money, and going home to your family. Absolutely not! The office is your real family, and you truly enjoy eating pizza with your boss and listening to his childhood stories. That's how you demonstrate your loyalty, which is the best quality of a good slave.

Point Fingers Privately. Don't say anything bad about anyone in public. No matter who is doing what, we're always a team; we're together. Together! This should be your main keyword when talking about results, problems, and risks. Never blame anyone---publicly. However, when you're talking in the kitchen with a few of your most trusted colleagues, let yourself go. Tell them who you think is the weakest part of the team and what you would do with him or her if you were the boss. Don't restrict yourself, but always make sure there are only a few people who can hear you.

Added 8-Oct-2015:
Never Ask for Vacation. When the time is right, your boss will inform you that you can go on vacation. He is the one who knows when it is suitable for the company to have you away for a few days. It can't be in the middle of a project, obviously, nor at the beginning, and definitely not near the end. It is usually matched with popular vacation periods of the year (e.g., Christmas, New Year). That might be a bit more expensive for you, but the reward of not betraying the company is priceless. If you do make the mistake of asking for a vacation, try to make it short. The worst thing you can do while trying to extend your vacation days is to mention that you could "still work while out of the office"---that will immediately put you in a position where you don't need the office, hence the office doesn't need you.

Added 30-Jun-2016:
CC Your Boss. Add your boss in CC on as many emails as possible. The more emails that come from you, the more valuable you are. Your boss must see that you're actively involved in many communications and that it's simply impossible to replace you. Besides that, CC-ing the boss is a sign of respect. She or he will never forget that.

Sigh, Don't Laugh. You must look very concerned about situations regarding the project, the team, the management, the office space, and everyone's future. If you're not concerned and laugh about it, how can you be trusted? That's a clear sign that you're not taking your job seriously, and who knows what you will do tomorrow. Don't be like that. Instead, always look a bit sad. God forbid you look happy in front of the boss.

Care About Everything. No matter what the discussion is about, you care about the subject. There is nothing involving the team that doesn't bother you. You must show that you feel responsible for every problem and each task. Also, when there is a discussion in the office and someone is doing his own thing, paying no attention to the subject, you should ask, "Doesn't this concern you at all?" Make him feel guilty for not being careful enough---that will give you a lot of points for "being on top of all things."

Look Tired. Always look a bit tired, as if you were working all night and barely got a few hours of sleep. Also, try to make it obvious that you were fixing some old bug in the system that nobody except you really cares about. You must not look too energetic---this disrespects your boss. He didn't give you enough work to wear you out completely? That means he is a bad manager. Instead, you should even joke that "our boss knows how to keep us busy." That flattery will definitely please even a smart person.

Added 13-Aug-2017:
Have No Hobbies. It's very annoying to see you enjoying something other than the job in the office. Your happiness and your heart must be here, at the work desk, not somewhere else, on a ski slope or a tennis court. Sports, dancing, singing, gaming, painting, even open sourcing---these are the activities you must hide. You are betraying your boss and your company if anything aside from the project makes you happy.


If you follow all these rules, you won't be fired, ever. Well, until the company is bankrupt, that is. If it's a startup, it will go bankrupt for sure, thanks to you and people like you. If it's a big enterprise, it probably won't, unfortunately. You will be safe, and your resume will have an impressive "12 years at Oracle" statement. Well, that's an achievement, isn't it?

I don't think so.


If you like this article, you will definitely like these very relevant posts too:

How Do You Punish Your Employees?
A sarcastic overview of the different types of abusive and manipulative behavior a bad manager may subject office employees to.

Hourly Pay Is Modern Slavery
Paying by the hour is a modern form of slavery that must be replaced by paying for results, which is a much more effective and healthy form of management.

Making Your Boss Happy Is a False Objective
It is very important to understand who you work for, the boss/customer who pays you or the project, and the difference between them.

© Yegor Bugayenko 2014–2018

Vertical and Horizontal Decorating

  • Moscow, Russia

A decorator pattern is one of the best ways to add features to an object without changing its interface. I use composable decorators quite often and always question myself as to how to design them right when the list of features must be configurable. I'm not sure I have the right answer, but here is some food for thought.

The Apartment (1960) by Billy Wilder

Let's say I have a list of numbers:

interface Numbers {
  Iterable<Integer> iterate();
}

Now I want to create a list that will only have odd, unique, positive, and sorted numbers. The first approach is vertical (I just made this name up):

Numbers numbers = new Sorted(
  new Unique(
    new Odds(
      new Positive(
        new ArrayNumbers(
          new Integer[] {
            -1, 78, 4, -34, 98, 4,
          }
        )
      )
    )
  )
);

The second approach is horizontal (again, a name I made up):

Numbers numbers = new Modified(
  new ArrayNumbers(
    new Integer[] {
      -1, 78, 4, -34, 98, 4,
    }
  ),
  new Diff[] {
    new Positive(),
    new Odds(),
    new Unique(),
    new Sorted(),
  }
);

See the difference? The first approach decorates ArrayNumbers "vertically," adding functionality through the composable decorators Positive, Odds, Unique, and Sorted.
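The article doesn't show the decorators' internals, so here is a minimal sketch of how one vertical decorator, Positive, might be implemented, together with a simple ArrayNumbers source (both class bodies are my assumption, not the author's actual code; the Numbers interface is repeated for self-containment):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface Numbers {
  Iterable<Integer> iterate();
}

// A plain source of numbers, backed by an array.
final class ArrayNumbers implements Numbers {
  private final Integer[] items;
  ArrayNumbers(final Integer[] items) {
    this.items = items;
  }
  @Override
  public Iterable<Integer> iterate() {
    return Arrays.asList(this.items);
  }
}

// A vertical decorator: wraps another Numbers and filters
// out non-positive values when iterated.
final class Positive implements Numbers {
  private final Numbers origin;
  Positive(final Numbers origin) {
    this.origin = origin;
  }
  @Override
  public Iterable<Integer> iterate() {
    final List<Integer> result = new ArrayList<>();
    for (final Integer num : this.origin.iterate()) {
      if (num > 0) {
        result.add(num);
      }
    }
    return result;
  }
}
```

Each of the other decorators (Odds, Unique, Sorted) would follow the same shape: take a Numbers in the constructor, transform its numbers in iterate().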

The second approach introduces a new interface, Diff, whose implementations---Positive, Odds, Unique, and Sorted---each apply their own modification to an iterable of numbers:

interface Diff {
  Iterable<Integer> apply(Iterable<Integer> origin);
}
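Here is a sketch of how Modified and one Diff implementation might look (the class bodies are my assumption of the internals, consistent with the interfaces in the article; both interfaces are repeated for self-containment):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface Numbers {
  Iterable<Integer> iterate();
}

interface Diff {
  Iterable<Integer> apply(Iterable<Integer> origin);
}

// One horizontal "diff": keeps only positive numbers.
final class Positive implements Diff {
  @Override
  public Iterable<Integer> apply(final Iterable<Integer> origin) {
    final List<Integer> result = new ArrayList<>();
    for (final Integer num : origin) {
      if (num > 0) {
        result.add(num);
      }
    }
    return result;
  }
}

// Applies every diff, in order, to the numbers of the origin.
final class Modified implements Numbers {
  private final Numbers origin;
  private final Diff[] diffs;
  Modified(final Numbers origin, final Diff[] diffs) {
    this.origin = origin;
    this.diffs = diffs;
  }
  @Override
  public Iterable<Integer> iterate() {
    Iterable<Integer> result = this.origin.iterate();
    for (final Diff diff : this.diffs) {
      result = diff.apply(result);
    }
    return result;
  }
}
```

Note that Modified applies the diffs in array order, so the order in which you list them matters when some diffs (like Sorted) are order-sensitive.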

For the user of numbers, both approaches are the same. The difference is only in the design. Which one is better and when? It seems that vertical decorating is easier to implement and is more suitable for smaller objects that expose just a few methods.

In my experience, I tend to start with vertical decorating, since it's easier to implement, but eventually migrate to horizontal decorating when the number of decorators starts to grow.

You're Just the Mayonnaise in a Bad Sandwich

  • Moscow, Russia

That's what a character played by actor Bruce Willis said to Robert De Niro's movie producer character in Barry Levinson's brilliant film What Just Happened. I second that. Producers, recruiters, managers, real estate agents, sales agents, lawyers, and outstaffers---what do they all have in common? They are middlemen standing between money and the proletariat, taking a huge percentage for themselves but adding no value. Their very existence is our mutual misfortune. We are too weak to get rid of them now, but sooner or later every supply chain will be mayonnaise-free. Look at Uber---taxi companies are dead already, and we now have only drivers and passengers with a computer system in between. The same will happen everywhere else.

What Just Happened (2008) by Barry Levinson

Seriously, look at IT recruiters, for example. To find a programmer, one has to pay about $30K (in Silicon Valley, if a programmer's salary is, say, $120K a year) to a recruiter. $30,000! What will this money be spent on? Or let me put it this way: How much software will I get for this money? Let me put it even better: Why don't I give this money to the programmer directly as a bonus for switching companies? Why do we need this recruiter between us---me and the programmer I'm going to hire? Can't we use this $30,000 more effectively?

Because software systems are not powerful enough yet? Because I can't find a programmer with a few clicks and I have to delegate this search function to someone for $30K?

Well, yes and no.

On one hand, there are plenty of job sites and rather powerful technologies for finding the right person. There is StackOverflow Careers, which not only allows me to find a programmer but also to see what he or she talks about and the quality of his or her questions and answers. There is GitHub, which shows the code a programmer has written, helping me easily judge its quality. There are professional certifications that show how strong a candidate's skills are. And there are plenty of other avenues.

On the other hand, these tools are not actively used by the majority of programmers and software companies---mostly because IT recruiters stay between us, stealing our money and protecting that position for themselves. Just like taxi companies remain between passengers and drivers, or real estate agents get in between house owners and house buyers, or outstaffing companies squeeze in the middle of project sponsors and engineers.

Imagine if there was no Google and you had to hire a "researcher" every time you needed to find some information. That's how it worked 50 years ago. Not anymore. Google solved the problem of information discovery. It is fast, it is accurate, and it is free. The researchers are out of business. Are we sorry about it? Well, maybe, but that's the way it should be. The same will happen with IT recruiters and all those "agents." They will eventually be out of business and will start doing something that actually adds some value for all of us.

At the moment, they are simply taking away our money, exploiting the fact that we're lazy, or stupid, or shy, or you name it. For example, it was not obvious in the beginning how to use Google. I know a few people who still don't know how to do it. I'm sure you know a few, too. They would rather call a friend when they need information than pull up Google.

Say I'm a good friend and Jeff calls me to ask what the weather will be like tomorrow in California. I'll advise him to Google it, now and every time in the future. I will teach him how to do it. But if I were a lousy friend and wanted Jeff to depend on me forever, I would just browse over to Google, check the weather, and tell Jeff it'll be cloudy.

That's exactly what recruiters are doing. Their entire business is based on the fact that we're not smart enough to use existing software systems, publicly available and in most cases very cheap or simply free. Or we're too shy to apply for a new job ourselves. Or we just don't know how to write a good resume and emphasize our skills properly. They are exploiting our weaknesses to make money.

A friend of mine was looking for a house in San Francisco a few months ago. He actually found the house on Zillow but paid $70,000 to two real estate agents to help him close the deal (the price of the house was close to $1.4 million, with 2.5 percent to each agent). What did these "hard-working" people do to earn his $70,000? They prepared the necessary paperwork and, of course, talked to him for a few weeks.

Can't we get rid of these two good-for-nothings and delegate their operations to a computer system? Well, we have Zillow, but how much of my friend's $70,000 found its way to Zillow? Almost nothing (I assume one of the agents paid a few pennies to publish an ad there). Is that fair? Let's instead give $5,000 out of every real estate transaction to Zillow and let it handle everything, automatically. Without any "agents" involved. Can we? I'm sure we can, and that's the future.

What will the army of real estate agents do? Well, maybe something useful, like cleaning streets.

The very existence of this mayonnaise in our modern business environment is a very negative thing. Money is simply not working the way it should. Also, since this mayonnaise is rather expensive, its existence has a very demotivating effect on those who actually deliver value while making a much smaller income. It clearly demonstrates that the entire system is defective and simply not fair.

The same is true about outstaffing companies, which we contract with to gain access to programmers sitting somewhere overseas or much closer. They find developers, hire them full-time, and resell their skills with a 150 percent or greater margin on a part-time or short-term basis. I've been getting a few offers from such companies every day.

They want me to pay, say, $40 for each hour of work, while the developer sitting in their office gets about $2,500 per month. This means roughly $25 out of my every $40 will be spent not on writing code but on something else. Also, the programmer will be motivated by the $2,500, not by the $7,000. So I will be paying about $7K per month and getting software worth $2,500 a month.

I will be paying for a Mercedes-Benz S-Class but getting a Ford Focus. I'm not greedy; I just want my every dollar to be converted into some value. In this scenario, $4,500 will be simply wasted.

The same is true about every single middleman in the market. They make business processes less effective, take away significant amounts of money, and slow down optimizations and innovations. A truly modern and innovative way of doing business is by directly connecting money and people who add value. There should be no one in between except computer systems.

Sometimes I hear the comment that people love to work with people, not computers. That's why we need all that mayonnaise---to make our life happier? It's true that people love to deal with people---people we really need, people who speak the same language, and people who deliver real value. Not with producers, recruiters, real estate agents, sales agents, outstaffers, lawyers, travel agents, investment brokers, executive officers, or taxi dispatchers.

The point of Bruce Willis's character is that when the sandwich is bad, you don't fix it with mayonnaise. It won't help; it will only make things worse.

Are You a Micromanager?

  • Moscow, Russia

Micromanagement, according to Wikipedia at the time of this writing, is "a management style whereby a manager closely observes or controls the work of subordinates or employees." Everyone knows micromanagement is evil, but what could be wrong with closely observing or controlling people's work? Nothing. Observing and controlling is not what's so bad about micromanagement. It is something completely different.

Office Space (1999) by Mike Judge

There are tons of articles about micromanagement. Most of them emphasize that the "micro" prefix prescribes the size of the tasks being managed, meaning a good manager doesn't care about the small stuff while a micromanager employs "excessive control or attention to details," as Merriam-Webster says.

It seems that in order to become a good manager, one should just stop paying attention to details. Huh? What could be worse than a manager who doesn't pay attention to details?

Micromanagement has nothing to do with the details observed or the amount of control a manager exerts over subordinates. Instead, it is all about how the details are observed and control is exercised. A micromanager gives instructions while a good manager defines goals and rules.

Micromanagers define algorithms for achieving results and insist on them being implemented according to their will. This is what a micromanager would sound like:

- Could you please stop what you're doing now
  and install Nginx on a new server? I beg you,
  don't do anything else until it's done.

This is how a good manager would delegate a similar task:

- Hey, the server with Nginx configured must
  be up and running by 6 p.m. I'm counting on you.

Pay attention to how polite our micromanager is and how rude the good manager is. However, it's obvious that the first one is extremely annoying while the second doesn't irritate us at all. Because it's all about how the task is defined---as an algorithm or as a goal with rules.

Micromanagers treat me as a dumb executor of their will. A micromanager is imperative. A good manager, on the other hand, is declarative. A good manager declares what needs to be done, never telling me how I must achieve it.

By the way, there is---surprisingly---a lot in common between management and object-oriented programming :) Good object-oriented programming is also declarative, not imperative.

Thus, this "micro" prefix is not really about the size of the tasks a manager keeps under control. It is about what a manager wants to see inside them---a black box or a glass box under a microscope.

A good manager doesn't care about what I'm doing now, what tasks I'm working on, or what my plans, problems, and risks are. Instead, a good manager cares about my results, to a very specific level of details. A good manager pays extreme attention to defining quality standards for my work, clearly explaining expectations to me, and explicitly defining the rules of failure and success. A good manager makes the path ahead of me very clear. With a good manager, I know exactly what results are expected and what will happen if I fail or succeed.

Thus, to be a good manager, you should never tell your subordinates how to complete their tasks. Instead, you should define what solutions and results are expected. And, of course, what will happen in the case of success or failure.

How to Fire Someone Right

  • Kiev, Ukraine

A friend of mine asked me today, "How should I fire someone the right way? What are the tricks to do it nicely, gracefully, and professionally?" I responded by saying that if you question yourself about how to do it right, you're doing it wrong in the first place. If firing is a painful and unpleasant process for you, there is a problem with your management model. Firing must be an easy and open procedure, visible and understood by the entire team.

Up in the Air (2009) by Jason Reitman

I mentioned this problem before in my post about team morale. I said that firing someone should not be done behind closed doors. Instead, the assessment of individual performance should occur openly and be visible to the entire team. If you need to close the door in order to talk "privately" to express your concerns and eventually to fire someone, you are a bad manager.

I also explained some time ago that a perfectly managed team is working for the project, not for you, the boss. The team must share the same goal and work towards it. The boss (or CEO, CTO, project manager, Scrum master, team lead, etc.) is there in order to enforce the rules accepted by the team. The team agrees to the rules, so the boss is just making sure they are enforced.

If firing is unpleasant for you, the rules are not clear.

If the rules are not clear, you're a bad manager.

The firing is unpleasant only when your decision is not supported by the team. You feel you're doing something wrong to the person you are firing and to the people who stay on the team. You feel it only because you don't have enough support from your team. You're acting as a dictator, not a true leader.

The firing decision should not be your decision. It should be derived from the rules your team agreed to work with. You should not fire when you don't like the person. Instead, you should fire when the person doesn't comply with the rules, like when there's a lack of performance.

When the rules are clear, everybody understands them, and reconciliation of performance is done regularly and openly, everybody will understand your firing decision and support it---including the person you're firing! Because it won't be your decision, but rather a decision logically derived from the rules. You will work for the project, not for your emotions or personal feelings.

By firing a person who is causing problems for the project, you will be doing a good thing for everybody---the project, the team, and the person who will go and find another place for his or her skills and talents.

Let me reiterate: If firing is unpleasant, there is a problem with the manager and the management.

When Do You Stop Testing?

  • Moscow, Russia

There is software to be tested. There is a team of testers. There is some money in the budget. There is some time in the schedule. We start right now. Testers try to break the product, finding bugs, reporting them, communicating with programmers when necessary, doing their best to find what's wrong. Eventually they stop and say, "We're done." How do they know when to stop? When is there enough testing? It's obvious---when there are no more bugs left and the product can be shipped! If you think like this, I have bad news for you. You're fundamentally wrong.

La fille sur le pont (1999) by Patrice Leconte

All this is perfectly explained by Glenford Myers in his great book The Art of Software Testing. I will just summarize it here again.

First, "testing is the process of executing a program with the intent of finding errors" (page 6). Pay attention, the intent is to find errors. Not to prove that the product works fine, but to prove that it doesn't work as intended. The goal of any tester is to show how the product can be broken, how it fails on different inputs, how it crashes under stress, how it misunderstands the user, how it doesn't satisfy the requirements. This is why Dr. Myers is calling testing "a destructive, even sadistic, process" (page 6). This is what most testers don't understand.

Second, any software has an unlimited number of bugs. Dr. Myers says that "you cannot test a program to guarantee that it is error free" (page 10) and that "it is impractical, often impossible, to find all the errors in a program" (page 8). This is also what most testers don't understand. They believe there is a limited number of bugs, which they have to find and then call it a day. There is literally no limit! The number of bugs is unlimited in any software product, no matter how small or big, complex or simple, new or old the product is.

With these axioms in mind, let's try to decide when testers have to stop. According to Dr. Myers, "one of the most difficult questions to answer when testing a program is determining when to stop, since there is no way of knowing if the error just detected is the last remaining error" (page 135).

They can't find all the bugs, no matter how much time we give them. And they are motivated to find more and more of them. But at some point we must make a decision and release the product. So it looks like we will release it with bugs inside? Yes, indeed! We will release a product full of bugs. The only questions are how many of them have already been found and how critical they were.

Let's put it all together. There are too many bugs to find all of them in a reasonable amount of time. However, we have to release a new version sooner or later. At the same time, testers will always tell us that there are more bugs and that they can find more; they just need more time. What to do?

Dr. Myers says that "since the goal of testing is to find errors, why not make the completion criterion the detection of some predefined number of errors?" (page 136). Indeed, we should predict how many bugs are just enough to find in order to have a desirable level of confidence that the product is ready to be shipped. Then we ship it, consciously understanding that it still contains an unlimited number of not-yet-discovered bugs.
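Myers' completion criterion boils down to a trivial decision rule. Here is a hypothetical illustration (class and method names are mine, and the forecast number would be picked by management):

```java
// A hypothetical illustration of Myers' completion criterion:
// testing is "complete" once a forecasted number of bugs has
// been detected, not when no more bugs can be found.
final class TestingPhase {
  private final int forecast;
  private int detected;

  TestingPhase(final int forecast) {
    this.forecast = forecast;
  }

  // A tester reports one more discovered bug.
  void reportBug() {
    this.detected += 1;
  }

  // The exit criterion: we found as many bugs as we predicted.
  boolean complete() {
    return this.detected >= this.forecast;
  }
}
```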

David West in Object Thinking says that "software is released for use, not when it is known to be correct, but when the rate of discovering errors slows down to one that management considers acceptable" (page 13).

Thus, the only valid criterion for exiting a testing process is the discovery of a forecasted number of bugs.

How to Set Up a Private Maven Repository in Amazon S3

  • Kiev, Ukraine

Amazon S3 is a perfect place for keeping private Maven artifacts. I assume you keep public artifacts in Maven Central because you want them to be available to everybody. Private artifacts are those you don't want visible to anyone except members of your team. Thus, you want to deploy your .jar files there and make sure they are visible only by your team. Here is how we do this in all our Java projects.

Create an S3 Bucket

First, you create a new S3 bucket. I would recommend you name it using your project domain and a prefix. For example, with repo.teamed.io, repo is a prefix and teamed.io is the domain.

There's no need to configure any permissions for this bucket. Just create it through the Amazon S3 console.

Create an IAM User

Create a new IAM user. I recommend you name it like teamed-maven if your project name is teamed.

Add a new "inline policy" to the user:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::repo.teamed.io",
        "arn:aws:s3:::repo.teamed.io/*"
      ]
    }
  ]
}

Here, repo.teamed.io is the name of the S3 bucket you created a minute ago.

Make sure you have an "access key" for this new user. It must look similar to this:

key: AKIAI9NNNJD5D7X4TUVA
secret: t5tZQCwuaRhmlOXfbGE5aTBMFw34iFyxfCEr32av

The key is 20 characters (all caps), and the secret is 40 characters.

Extend settings.xml

Add this configuration to your ~/.m2/settings.xml file:

<settings>
  <servers>
    <server>
      <id>repo.teamed.io</id>
      <username>AKIAI9NNNJD5D7X4TUVA</username>
      <password>t5tZQCwuaRhmlOXfbGE5aTBMFw34iFyxfCEr32av</password>
    </server>
    [...]
  </servers>
  [...]
</settings>

Configure pom.xml

Add this configuration to pom.xml:

<project>
  <distributionManagement>
    <snapshotRepository>
      <id>repo.teamed.io</id>
      <url>s3://repo.teamed.io/snapshot</url>
    </snapshotRepository>
    <repository>
      <id>repo.teamed.io</id>
      <url>s3://repo.teamed.io/release</url>
    </repository>
  </distributionManagement>
  <repositories>
    <repository>
      <id>repo.teamed.io</id>
      <url>s3://repo.teamed.io/release</url>
    </repository>
  </repositories>
  [...]
</project>

Then, configure S3 Wagon, also in pom.xml:

<project>
  <build>
    <extensions>
      <extension>
        <groupId>org.kuali.maven.wagons</groupId>
        <artifactId>maven-s3-wagon</artifactId>
        <version>1.2.1</version>
      </extension>
    </extensions>
    [...]
  </build>
</project>

You're ready to go. You can deploy your artifacts just by running Maven from the command line:

$ mvn clean deploy

Configure s3auth.com

Now you want to see these artifacts in your browser, in a secure mode, by providing secure credentials. I recommend you use s3auth.com, as explained in Basic HTTP Auth for S3 Buckets.

Configure Rultor

Another recommendation is to configure rultor.com for deployment of your artifacts to S3 automatically.

First, encrypt your settings.xml with the rultor command-line gem:

$ gem install rultor
$ rultor encrypt -p me/test settings.xml

Instead of me/test, you should use the name of your GitHub project.

As a result, you will get a new file named settings.xml.asc. Add it to the root directory of your project, then commit and push. The file contains your secret information, but only the Rultor server can decrypt it.

Create a .rultor.yml file in the root directory of your project (The Rultor reference page explains this format in greater detail):

decrypt:
  settings.xml: "repo/settings.xml.asc"
deploy:
  script: |
    mvn clean deploy --settings ../settings.xml

Now it's time to see how it all works together. Create a new ticket in the GitHub issue tracker and post something like this into it (read more about Rultor commands):

@rultor deploy

You will get a response in a few seconds. The rest will be done by Rultor.

That's it.

Redundant Variables Are Pure Evil

  • Kiev, Ukraine

A redundant variable is one that exists exclusively to explain its value. I strongly believe that such a variable is not only pure noise but also evil, with a very negative effect on code readability. When we introduce a redundant variable, we intend to make our code cleaner and easier to read. In reality, though, we make it more verbose and difficult to understand. Without exception, any variable used only once is redundant and must be replaced with a value.

Y Tu Mamá También (2001) by Alfonso Cuarón

Here, variable fileName is redundant:

String fileName = "test.txt";
print("Length is " + new File(fileName).length());

This code should look like this instead:

print("Length is " + new File("test.txt").length());

This example is very primitive, but I'm sure you've seen these redundant variables many times. We use them to "explain" the code---it's not just a string literal "test.txt" anymore but a fileName. The code looks easier to understand, right? Not really.

Let's dig into what "readability" of code is in the first place. I think this quality can be measured by the number of seconds I need to understand the code I'm looking at. The longer the timeframe, the lower the readability. Ideally, I want to understand any piece of code in a few seconds. If I can't, that's a failure of its author.

Remember, if I don't understand you, it's your fault.

An increasing length of code degrades readability. So the more variable names I have to remember while reading through it, the longer it takes to digest the code and come to a conclusion about its purpose and effects. I think four is the maximum number of variables I can comfortably keep in my head without thinking about quitting the job.

New variables make the code longer because they need extra lines to be declared. And they make the code more complex because its reader has to remember more names.

Thus, when you want to introduce a new variable to explain what your code is doing, stop and think. Your code is too complex and long in the first place! Refactor it using new objects or methods but not variables. Make your code shorter by moving pieces of it into new classes or private methods.
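For example (a hypothetical sketch with made-up names), an explaining variable like tax can become a private method whose name carries the explanation, keeping the main method short:

```java
// A hypothetical sketch: instead of a one-shot "explaining" variable,
// the expression moves into a private method whose name explains it.
final class Price {
  private final int net;
  private final int percent;

  Price(final int net, final int percent) {
    this.net = net;
    this.percent = percent;
  }

  // Before: int tax = this.net * this.percent / 100; return this.net + tax;
  // After: the "tax" variable becomes a method.
  int gross() {
    return this.net + this.tax();
  }

  private int tax() {
    return this.net * this.percent / 100;
  }
}
```

The reader of gross() no longer has to remember a variable; the name of the method explains the value at the exact place it is used.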

Moreover, I think that in perfectly designed methods, you won't need any variables aside from method arguments.

Need Robust Software? Make It Fragile

  • Dnipro, Ukraine

In any software project, the goal is to create something stable. We don't want it to break in front of a user. We also don't want our website to show an "internal application error" instead of a web page. We want our software to work, not fail. That's a perfectly valid and logical desire, but in order to achieve that, we have to make our software as fragile as possible. This may sound counter-intuitive, but that's the way it is. The more fragile your app is in development, the more robust it is in production.

Black Cat, White Cat (1998) by Emir Kusturica

By fragile, I'm referring to the Fail Fast philosophy, which is the opposite of Fail Safe. I believe you know the difference, but let me remind you anyway, by example. This is Fail Safe:

public long size(File file) {
  if (!file.exists()) {
    return 0;
  }
  return file.length();
}

This method is supposed to calculate and return a file size. It first checks whether the file exists. If it doesn't exist, the method returns zero. Indeed, the file is absent, so there is no size. We could complain that the file is absent, but what for? Why make noise? Let's keep it quiet and return zero. We don't fail because we're trying to keep the app running. This is called Fail Safe.

To the contrary, this is how Fail Fast looks:

public long size(File file) {
  if (!file.exists()) {
    throw new IllegalArgumentException(
      "There is no such file; I can't get its length."
    );
  }
  return file.length();
}

We can't find a file? We don't hide this fact. We make this situation public and visible. We scream and cry. We throw an exception. We want the app to crash, break, and fail, because someone gave us a file that doesn't exist. We complain and protest. This is called Fail Fast.

Which philosophy, if we follow it everywhere, will make our software robust and failure-resilient? Only the second one---the Fail Fast.

Why? Because the quicker and easier the failure is, the faster it will be fixed. And the fix will be simpler and also more visible. Fail Fast is a much better approach for maintainability. The code becomes cleaner. It is much easier to track a failure. All methods are ready to break and throw an exception on even the tiniest problem.

In this example, if the method returns zero, it's not obvious whether the file exists and its size is actually zero or if its name is wrong and it is just not found. The Fail Safe approach conceals problems and makes code less maintainable, and that's why it's difficult to stabilize.
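Another benefit of the fail-fast version is that the failure itself becomes testable. A caller's unit test (a hypothetical sketch, repeating the method in a small class) can pin the behavior down explicitly:

```java
import java.io.File;

// A hypothetical wrapper around the fail-fast method from the article.
final class FileSize {
  // Fail Fast: refuse to work with a non-existent file.
  long size(final File file) {
    if (!file.exists()) {
      throw new IllegalArgumentException(
        "There is no such file; I can't get its length."
      );
    }
    return file.length();
  }
}
```

A test can now assert that a missing file produces an exception, instead of silently getting a zero that could mean either "empty file" or "no file."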

In the beginning, in production, we will have many crashes and errors. But all of them will be visible and easy to understand. We will fix them and cover them with unit tests. Each fix will make our software more stable and better covered by tests.

Software designed with the Fail Safe approach in mind will look more stable at the beginning, but it will degrade quickly and inevitably turn into an unmaintainable mess.

Software designed with the Fail Fast approach in mind will crash frequently at the beginning but will improve its stability with every fix and eventually become very stable and robust.

That's why fragility is the key success factor for robustness.

Why Many Return Statements Are a Bad Idea in OOP

  • Kiev, Ukraine

This debate is very old, but I have something to say too. The question is whether a method may have multiple return statements or always just one. The answer may surprise you: In a pure object-oriented world, a method must have a single return statement and nothing else. Yes, just a return statement and that's it. No other operators or statements. Just return. All arguments in favor of multiple return statements go against the very idea of object-oriented programming.

This is a classical example:

public int max(int a, int b) {
  if (a > b) {
    return a;
  }
  return b;
}

The code above has two return statements, and it is shorter than this one with a single return:

public int max(int a, int b) {
  int m;
  if (a > b) {
    m = a;
  } else {
    m = b;
  }
  return m;
}

More verbose, less readable, and slower, right? Right.

This is the code in a pure object-oriented world:

public int max(int a, int b) {
  return new If(
    new GreaterThan(a, b),
    a, b
  );
}

What do you think now? There are no statements or operators. No if and no >. Instead, there are objects of class If and GreaterThan.

This is a pure and clean object-oriented approach.
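Java ships with no such classes, but a minimal sketch of what If and GreaterThan could look like (the classes and their value() method are my assumptions, not a real library) might be:

```java
// Hypothetical classes, not part of the JDK or any real library;
// the value() method is my assumption about how they would be used.
final class GreaterThan {
  private final int left;
  private final int right;
  GreaterThan(final int left, final int right) {
    this.left = left;
    this.right = right;
  }
  boolean value() {
    return this.left > this.right;
  }
}

final class If {
  private final GreaterThan condition;
  private final int first;
  private final int second;
  If(final GreaterThan condition, final int first, final int second) {
    this.condition = condition;
    this.first = first;
    this.second = second;
  }
  int value() {
    // A primitive branch still exists at the very bottom,
    // but it is encapsulated here once and for all.
    return this.condition.value() ? this.first : this.second;
  }
}
```

With these classes, max() becomes `new If(new GreaterThan(a, b), a, b).value()`.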

However, Java doesn't have that. Java (and many other pseudo-OOP languages) gives us operators like if, else, switch, for, while, etc. instead of giving us built-in classes that would do the same. Because of that, we continue to think in terms of procedures and keep talking about whether two return statements are better than one.

If your code is truly object-oriented, you won't be able to have more than one return. Moreover, you will have nothing except a return in each method. Actually, you will have only two operators in the entire software---new and return. That's it.

Until we're there, let's stick with just one return and at least try to look like pure OOP.

Nine Steps to Start a Software Project

  • Kiev, Ukraine

Agile or not, a software project starts with a requirements analysis and definition. We basically define what needs to be done somehow, be it on a piece of napkin or a 100-page Word document. The next step is to turn this into a working piece of software as fast as possible and by spending as few dollars as possible. Ideally, this prototyping takes a week and is done by an architect working solo. Once the "skeleton" is ready, we start putting software "meat" on it. We recruit a team of programmers for that or outsource it. I see nine important steps in the skeleton creation part; let me walk you through them one by one.

Ying xiong (2002) by Yimou Zhang

Let's use some examples to make this more illustrative. Let's say I'm a software architect and the project is a "Google killer." We're hired to create a new search engine, and my job is to turn requirements into a prototype, a.k.a. a skeleton or a proof of concept. This is what I have as an input (let's say it's a piece of napkin... what else would it be for a Google killer, right?):

Each page is ranked by the number of mentions in
social networks like Twitter, LinkedIn, Facebook, etc.
The more mentions it has, the higher the rank and the
higher its position in the search results page.

Seems like a doable project to me, and the requirements document is clear enough. It doesn't say anything about performance, but I can assume that it has to be as fast as Google. The same goes for scalability, stress resilience, etc.

I'm not going to discuss how the software is created in a specific technical stack. That's not important for this article. What's important now is how my programming work will be "wrapped." In other words, what will I hand off to the team of programmers after a week of hard work---what is my product, or more formally, my deliverables.

Thus, let's assume I managed to create a piece of software and it works.

Decisions and Alternatives

First of all, I have to document my key technical decisions and their alternatives. We usually work in GitHub, and the best documentation medium is the README.md file in the root directory of the repo. I just put my text there in plain Markdown format. That's enough for a good technical document---it has to be short; that's important.

For each decision I made, there has to be at least one alternative that I considered and rejected. There are two items at the top of my list:

Apache Lucene is a search engine. It is popular,
  mature enough, scalable, and written in Java. Alternatives
  are Solr, Sphinx, Gigablast, and many others.
Java 8 is a programming language, and JVM is a
  runtime platform. I know how they work, and the team
  has enough experience with them. Alternatives are
  Ruby, Python, Go, Scala, and tons of others.

These decisions are very high-level, but I still need to document them. As you see, I'm not explaining in detail why the alternatives were rejected, and it's my choice. If someone questions my decisions in the future, they may say that the alternatives were not analyzed properly. It will be clear whose fault it was---mine. So I'm taking full responsibility for these two choices I've made: Lucene and Java 8.

Here is yet another item for the list:

Three modules make up the app: UI, scraper,
  and analyzer. They are fully decoupled and
  communicate strictly through Lucene. I don't
  see any alternatives.

Then, I attach a simple diagram to illustrate my decision:

PlantUML SVG diagram

As you see, in this case, I totally ignored all alternatives. I didn't even mention them. Again, I take full responsibility for that; I said, "I don't see any alternatives." If, later, a better alternative is discovered, it will be obvious why we overlooked it and whose fault it was. It's not only about punishment but about discipline and traceability of decisions. Each decision must be traceable to the person who made it. This helps us avoid bad decisions in the future and makes the entire project more maintainable and transparent.

Let's add one more decision to the list:

Takes Framework is used for UI. It helps keep our
  code truly object-oriented, testable, fast, and
  decoupled from the data model. Alternatives:
  - Spring: It is big, complex, and ugly
  - Play: Similar to Spring, big and ugly
  - Spark: Not as clear as Takes

In this case, I documented the alternatives and gave my reasons why they are not good for us. As you see, the reasons are very biased; I basically expressed my personal opinions about these three frameworks and definitely gave preference to my own open source Takes framework. Is it good? No, it's not. But I'm the architect, and I do what I think is right for the project.

I'm trying to show that the point of this documentation is for me, the architect, to explain my way of thinking---no matter how bad, biased, or irrational it was. I have to write my decisions down and let the project know them all.

I would suggest you keep the number of documented decisions somewhere between four and twelve. If there are fewer than four, I probably forgot to document something important. If there are more than twelve, I'm documenting too many unimportant decisions and should use other media for them, like JavaDoc blocks or self-documenting classes.

Concerns

The next chapter in the README.md file has to explain how exactly I managed to address all concerns expressed in the initial requirements. I mentioned above that it goes without saying that our system must be as fast and scalable as Google. Thus, let's say there are two "concerns"---performance and scalability.

As a software architect, I must address them both. In other words, I have to prove that my solution is fast and scalable. Maybe it's not, but if I believe it is, I have to explain why I think so. I can't be quiet about the concerns. Here is what I would say about performance:

The system is as fast as the Lucene search engine, while
Lucene is rather fast even with large amounts of data.

And this one is about scalability:

The bottleneck is in Lucene, and it is scalable
vertically. Not sure about horizontal scalability.

As you see, I'm trying to be honest and tell the truth. We'll be able to review these statements later and decide whether I was right or wrong. But we need to have my answers to all concerns expressed in the requirements.

Assumptions

The next section is about assumptions I've made while working with the prototype. We usually make assumptions when we don't have enough factual information, and we basically fill the gaps. There is nothing wrong with it, but we have to document which gaps were filled and why.

How about these two assumptions:

1. I assume that social platforms won't block our
   calls and will provide counters for all pages.
2. I assume that Lucene will be enough for both
   indexing and data storage, so we won't need a
   database engine.

I made these assumptions without proper analysis of the situation. I don't know whether Twitter will be happy to see millions of requests every hour coming from our server or not. Maybe it will ban us; I don't know. I don't have to evaluate this and find an exact answer. I just made an assumption and documented it.

Will it be enough to have Lucene only, without any additional data persistence layer? I don't know, but I hope so. I don't have time to do a detailed analysis of our entire data model and its potential future requirements. I just make an assumption and call it a day.

If later, during the handoff, the project sponsor says this assumption exposes too much risk for the project, we'll do a better analysis. For now, my job is to document what I see and move on. Remember, I have just a week of time.

Risks

Now I list all potential problems I foresee and estimate their probability and impact. Let me show you an example first:

1. Lucene may not be able to handle billions of documents [6x9]
2. Social platforms will ban our requests [8x9]

The first number in square brackets is the probability and the second one is the impact, on a 0 to 9 scale. If both numbers are nine, it's not a risk anymore; it's a fact. If both numbers are zero, we can simply ignore this risk.

I listed just two, but in a real system there should be somewhere between four and twelve risks. Listing too many risks is a sign that the prototype is not focused enough, while listing too few suggests a lack of attention.
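The probability-times-impact score makes the list sortable by exposure; a minimal sketch (class and method names are mine) of how such a register could rank the two risks above:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// A risk scored by probability and impact, both on a 0..9 scale.
final class Risk {
  final String title;
  final int probability;
  final int impact;
  Risk(final String title, final int probability, final int impact) {
    this.title = title;
    this.probability = probability;
    this.impact = impact;
  }
  int exposure() {
    // probability times impact: [6x9] -> 54, [8x9] -> 72
    return this.probability * this.impact;
  }
}

final class RiskRegister {
  // Returns the risks ordered by exposure, biggest first.
  static List<Risk> ranked(final List<Risk> risks) {
    final List<Risk> copy = new ArrayList<>(risks);
    copy.sort(Comparator.comparingInt((Risk r) -> r.exposure()).reversed());
    return copy;
  }
}
```

A [9x9] entry would sort to the very top, which matches the intuition that it's no longer a risk but a fact to deal with first.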

Continuous Integration

Now I have to make sure the product is "wrapped" in continuous integration, which is a critical component of any software package. I have to configure it, preferably in the cloud, and make sure the build is clean.

It is also important to make sure the continuous integration pipeline covers all critical areas, including:

  • Building on multiple platforms, such as Linux, Windows, and Mac.
  • Running integration tests and unit tests.
  • Running static analysis.
  • Collecting test coverage.
  • Generating documentation.

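For a Java project of that era, such a cloud pipeline could be a hypothetical Travis CI configuration like the sketch below (the `qulice` Maven profile name is an assumption, not a real artifact of this project):

```yaml
# Hypothetical .travis.yml sketch.
language: java
jdk:
  - oraclejdk8
install: true
script:
  # Full build with tests, coverage, and static analysis enabled
  - mvn clean install -Pqulice --errors
```
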
The stricter the pipeline, the better it is for the project. At this stage, my job, as an architect, is to build a "guard wall" around the product to protect it against future chaos. The chaos will come from programmers making changes through pull requests. They will care much less about the entire quality of the product than I do, and that's why I have to incorporate tools that keep the situation under control.

My goal is to make the continuous integration pipeline as fragile as possible. Any minor error should lead to a build failure. Of course, I'm talking about reproducible failures. The build should fail in a predictable way, not sporadically.

Static Analysis

This is yet another critical component of any software project. You have to analyze the quality of code statically. In the most primitive approach, a static analysis will check the formatting of your source code and fail the build when that formatting is broken. However, in a more advanced variant, static analysis will catch many important bugs.

It is called "static" because it doesn't require the software to be running. To the contrary, unit tests validate software quality in runtime by running the app.

There are many static analysis tools, for almost every language and format. I strongly recommend you use them. Moreover, I recommend you configure them as strictly as possible in order to make the build as fragile as you can. The fragility of the build is a key success factor in software development.
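In a Java/Maven project, for instance, the qulice-maven-plugin (which aggregates Checkstyle, PMD, and FindBugs) can be wired into the build so that any violation fails it; a sketch (the version number is illustrative):

```xml
<plugin>
  <groupId>com.qulice</groupId>
  <artifactId>qulice-maven-plugin</artifactId>
  <version>0.18.19</version>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this in place, a formatting violation or a suspicious construct breaks the build exactly as a failing unit test would.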

Test Coverage

Test coverage must be collected on every build and, at the very least, reported. In an ideal scenario, low test coverage must fail the build. Let's say I set the required percentage of coverage to 75 percent (it's actually a more complex metric, but in a primitive approach just one number is enough). If someone introduces a new class without a unit test, the coverage percentage goes down and the build breaks.

My job, as an architect creating a prototype, is to make sure the coverage is calculated on every build and is under control---it can't go lower than the threshold I set.

No matter how low the threshold is, what matters is whether it is under control or not.
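With Maven, such a threshold can be enforced by the jacoco-maven-plugin check goal; a sketch (the version number is illustrative, the 0.75 limit matches the example above):

```xml
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.2</version>
  <executions>
    <execution>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
    </execution>
    <execution>
      <id>check</id>
      <goals>
        <goal>check</goal>
      </goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>LINE</counter>
                <value>COVEREDRATIO</value>
                <minimum>0.75</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```
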

Continuous Delivery

This is the final step before the handoff. I have to configure a continuous delivery pipeline to make sure the product is packaged and deployed in one click. This is a very important---critically important---step. Without it, everything done before and the piece of software itself is just a collection of files. A piece of software is a product when it is packageable and deployable in one click.

"Pipeline" means that there are a number of elements chained sequentially; for a Java application, for example:

  • Run automated build (the same as in continuous integration)
  • Package JAR file
  • Upload JAR file to repository
  • Build JavaDoc site
  • Upload JavaDoc site to Amazon S3

I'm using Rultor to automate the entire pipeline and simplify its start, stop, and logging. I just post a "please release now" message to a GitHub ticket, and the product is packaged and deployed in a few minutes.

Acceptance

The last step is the handoff---I have to present my solution to the project manager, the sponsor of the project, and the team. Everybody has to accept it. It doesn't mean they will like it, and that's not the goal. The goal is to deliver a complete solution, with risks, assumptions, and decisions documented, continuous integration configured, static analysis enforced, etc. If my solution isn't good enough by their criteria, they will change the architect and try again.

My objective is not to satisfy them but to do the best I can according to the requirements and my professional understanding of the problem and business domains. I wrote about this some time ago: A Happy Boss Is a False Objective. Again, my objective is not to make them happy. Instead, my objective is to make a perfect prototype, the way I understand the word perfect. If I fail, I fail. The project will get another architect and try again.

That's it. The skeleton is ready, and my job is done.

Checked vs. Unchecked Exceptions: The Debate Is Not Over

  • Sunnyvale, CA

Do we need checked exceptions at all? The debate is over, isn't it? Not for me. While most object-oriented languages don't have them, and most programmers think checked exceptions are a Java mistake, I believe in the opposite---unchecked exceptions are the mistake. Moreover, I believe multiple exception types are a bad idea too.

True Romance (1993) by Tony Scott

Let me first explain how I understand exceptions in object-oriented programming. Then I'll compare my understanding with a "traditional" approach, and we'll discuss the differences. So, my understanding first.

Say there is a method that saves some binary data to a file:

public void save(File file, byte[] data)
  throws Exception {
  // save data to the file
}

When everything goes right, the method just saves the data and returns control. When something is wrong, it throws Exception and we have to do something about it:

try {
  save(file, data);
} catch (Exception ex) {
  System.out.println("Sorry, we can't save right now.");
}

When a method says it throws an exception, I understand that the method is not safe. It may fail sometimes, and it's my responsibility to either 1) handle this failure or 2) declare myself as unsafe too.

I know each method is designed with the single responsibility principle in mind. This is a guarantee to me that if method save() fails, it means the entire saving operation can't be completed. If I need to know what the cause of this failure was, I will un-chain the exception---traverse the stack of chained exceptions and stack traces encapsulated in ex.

I never use exceptions for flow control, which means I never recover situations where exceptions are thrown. When an exception occurs, I let it float up to the highest level of the application. Sometimes I rethrow it in order to add more semantic information to the chain. That's why it doesn't matter to me what the cause of the exception thrown by save() was. I just know the method failed. That's enough for me. Always.
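Rethrowing to add semantic information to the chain might look like this sketch (the private save() stub here just simulates a failure; class and method names are mine):

```java
import java.io.File;

// update() rethrows, adding semantic information to the chain,
// instead of recovering.
final class Data {
  private final File file;
  private final byte[] bytes;
  Data(final File file, final byte[] bytes) {
    this.file = file;
    this.bytes = bytes;
  }
  void update() throws Exception {
    try {
      this.save(this.file, this.bytes);
    } catch (final Exception ex) {
      // No recovery here: report the context and let it float up.
      throw new Exception("Failed to update the data", ex);
    }
  }
  private void save(final File target, final byte[] data) throws Exception {
    // Simulated low-level failure.
    throw new Exception("Disk is full");
  }
}
```

Whoever catches the exception at the top can traverse getCause() to see the full story.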

For the same reason, I don't need to differentiate between different exception types. I just don't need that type of hierarchy. Exception is enough for me. Again, that's because I don't use exceptions for flow control.

That's how I understand exceptions.

According to this paradigm, I would say we must:

  • always use checked exceptions.

  • never throw/use unchecked exceptions.

  • use only Exception, without any sub-types.

  • always declare one exception type in the throws block.

  • never catch without rethrowing; read more about that here.

This paradigm diverges from many other articles I've found on this subject. Let's compare and discuss.

Runtime vs. API Exceptions

Oracle says some exceptions should be part of an API (checked ones) while others are runtime exceptions and should not be part of it (unchecked). They will be documented in JavaDoc but not in the method signature.

I don't understand the logic here, and I'm sure Java designers don't understand it either. How and why are some exceptions important while others are not? Why do some of them deserve a proper API position in the throws block of the method signature while others don't? What are the criteria?

I have an answer here, though. By introducing checked and unchecked exceptions, Java developers tried to solve the problem of methods that are too complex and messy. When a method is too big and does too many things at the same time (violates the single responsibility principle), it's definitely better to let us keep some exceptions "hidden" (a.k.a. unchecked). But it's not a real solution. It is only a temporary patch that does all of us more harm than good---methods keep growing in size and complexity.

Unchecked exceptions are a mistake in Java design, not checked ones.

Hiding the fact that a method may fail at some point is a mistake. That's exactly what unchecked exceptions do.

Instead, we should make this fact visible. When a method does too many things, there will be too many points of failure, and the author of the method will realize that something is wrong---a method should not throw exceptions in so many situations. This will lead to refactoring. The existence of unchecked exceptions leads to a mess. By the way, checked exceptions don't exist at all in Ruby, C#, Python, PHP, etc. This means that creators of these languages understand OOP even less than Java authors.

Checked Exceptions Are Too Noisy

Another common argument against checked exceptions is that they make our code more verbose. We have to put try/catch everywhere instead of staying focused on the main logic. Bozhidar Bozhanov even suggests a technical solution for this verbosity problem.

Again, I don't understand this logic. If I want to do something when method save() fails, I catch the exception and handle the situation somehow. If I don't want to do that, I just say my method also throws and pay no attention to exception handling. What is the problem? Where is the verbosity coming from?

I have an answer here, too. It's coming from the existence of unchecked exceptions. We simply can't always ignore failure, because the interfaces we're using don't allow us to do this. That's all. For example, class Runnable, which is widely used for multi-thread programming, has method run() that is not supposed to throw anything. That's why we always have to catch everything inside the method and rethrow checked exceptions as unchecked.
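For example, a Runnable that saves something is forced to smuggle the checked exception out as an unchecked one; a sketch (the save() stub simulates a failure):

```java
// Runnable#run() declares no checked exceptions, so the checked
// one must be caught and rethrown as unchecked.
final class Saving implements Runnable {
  @Override
  public void run() {
    try {
      this.save();
    } catch (final Exception ex) {
      throw new IllegalStateException(ex);
    }
  }
  private void save() throws Exception {
    // Simulated failure.
    throw new Exception("disk is full");
  }
}
```

The interface leaves us no choice: the failure is hidden from the method signature whether we like it or not.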

If all methods in all Java interfaces were declared either as "safe" (throws nothing) or "unsafe" (throws Exception), everything would become logical and clear. If you want to stay "safe," take responsibility for failure handling. Otherwise, be "unsafe" and let your users worry about safety.

No noise, very clean code, and obvious logic.

Inappropriately Exposed Implementation Details

Some say the ability to put a checked exception into throws in the method signature instead of catching it here and rethrowing a new type encourages us to have too many irrelevant exception types in method signatures. For example, our method save() may declare that it may throw OutOfMemoryException, even though it seems to have nothing to do with memory allocation. But it does allocate some memory, right? So such a memory overflow may happen during a file saving operation.

Yet again, I don't get the logic of this argument. If all exceptions are checked, and we don't have multiple exception types, we just throw Exception everywhere, and that's it. Why do we need to care about the exception type in the first place? If we don't use exceptions to control flow, we won't do this.

If we really want to make our application memory overflow-resistant, we will introduce some memory manager with something like a bigEnough() method that tells us whether our heap is big enough for the next operation. Using exceptions in such situations is a totally inappropriate approach to exception management in OOP.
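A sketch of such a hypothetical memory manager (the interface and class names are mine; only the bigEnough() idea comes from the text):

```java
// Hypothetical interface: ask before allocating, instead of
// catching memory errors afterward.
interface MemoryManager {
  boolean bigEnough(long required);
}

final class JvmMemory implements MemoryManager {
  @Override
  public boolean bigEnough(final long required) {
    final Runtime runtime = Runtime.getRuntime();
    // Heap still claimable = max heap minus what is already used.
    final long used = runtime.totalMemory() - runtime.freeMemory();
    final long available = runtime.maxMemory() - used;
    return available >= required;
  }
}
```

The saving code would consult this manager before the operation, keeping exceptions out of the flow entirely.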

Recoverable Exceptions

Joshua Bloch, in Effective Java, says to "use checked exceptions for recoverable conditions and runtime exceptions for programming errors." He means something like this:

try {
  save(file, data);
} catch (Exception ex) {
  // We can't save the file, but it's OK
  // Let's move on and do something else
}

How is that any different from a famous anti-pattern called Don't Use Exceptions for Flow Control? Joshua, with all due respect, you're wrong. There are no such things as recoverable conditions in OOP. An exception indicates that the execution of a chain of calls from method to method is broken, and it's time to go up through the chain and stop somewhere. But we never go back again after the exception:

App#run()
  Data#update()
    Data#write()
      File#save() <-- Boom, there's a failure here, so we go up

We can start this chain again, but we don't go back after throw. In other words, we don't do anything in the catch block. We only report the problem and wrap up execution. We never "recover"!


All arguments against checked exceptions demonstrate nothing but a serious misunderstanding of object-oriented programming by their authors. The mistake in Java and in many other languages is the existence of unchecked exceptions, not checked ones.

Hourly Pay Is Modern Slavery

  • Palo Alto, CA

What is the difference between a slave and a free man? A slave has a master. That master, a cruel and ruthless villain, is telling the slave what to do and punishing him at will. However, at the same time, the master guarantees food and a roof over the slave's head. A slave understands this exchange of freedom for food as a fair trade, while a free man values freedom more. That's the difference. That's how it was in the Roman empire centuries ago. But isn't that how it is now too? Aren't we ancient slaves in our offices and in front of our monitors?

Gladiator (2000) by Ridley Scott

This post is partially provoked by a recently published semi-historical book How to Manage Your Slaves by Jerry Toner. However, there's more. The entire eXtremely Distributed Software Development (XDSD) methodology is based on a primary fundamental principle that states, "Everyone gets paid for verified deliverables." They're not paid for their time per hour, per week, or per month, but rather for verified deliverables. What is the difference and what does all this have to do with slavery? Let's see.

My point here is that any payment schedule based on time instead of results is turning us into slaves.

Here is an example. Say I'm a software developer and I need some money to pay my bills, buy a car, rent a house, and enjoy my life. I have some skills for that. I can write Java software. I find a company that needs my skills. It hires me, and we sign a contract. The contract says I have to be in the office from 9 until 5 and I have to do what my boss tells me to do. In exchange, I will get paychecks every few weeks, which will cover my expenses.

Doesn't it look similar to what I just said above about ancient slaves? The master (a.k.a. CEO) tells me what to do and punishes me at will; in exchange, he gives me food and safety.

The problem here is not about punishment. This model of work makes me think like a slave. I think this is a fair exchange---I give away my freedom (my time), and I get back food and safety. I'm a slave not because the master is punishing me and I'm sitting all day long in fetters. Absolutely not. A slave is not the one who was captured and imprisoned. A slave is one who thinks slavery is a fair way of management.

Letting someone tell me what to do with my time in order to get food in return is exactly what slavery is about.

What is freedom, then? How should a free man make money?

A free man sells the results of his work. A free man cleans someone's house and bills him when the work is done. A free man drives passengers from the airport to their home and bills them when they get there. A free man creates a software module and bills the client when it's ready. A free man translates a document and bills per page. A free man cooks a cake and bills for it.

A free man sells results, not time.

Also, a free man takes care of the food and security on his or her own.

Is it more risky? Yes. Is it more stressful? Yes. But that's what freedom is about.

This is my favorite quote from the book: "a good slave is loyal, hard-working and vigilant."

Think about it.

Fools Don't Write Unit Tests

  • Palo Alto, CA

"We don't have time to write unit tests" or "We don't have the budget for unit testing" are complaints I hear very often. Sometimes it may sound like, "We don't use TDD, so that's why there are no unit tests," or even "TDD is too expensive for us now." I'm sure you've heard this or even said it yourself. It doesn't make any sense to me. I don't get the logic. In my understanding, unit testing is not a product; it's a tool. You use tests to develop a product faster and better. How can you say you don't have time to use the tool that makes your work faster? Let me show you how.

Ex Machina (2015) by Alex Garland

TDD or not, a unit test is a unit test. Either you create it before the main piece of code or after it.

A unit test is a tool that helps you, a developer of software, "run" your stuff and see how it works. How else can you check if it works? When I hear, "I don't have time for unit tests," my next question is: "How did you test your code?"
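For illustration, here is a framework-free sketch of the idea (in a real project you'd use JUnit; the names are mine): a unit test is nothing more than code that runs your code and checks the result.

```java
// A trivial method under test and a "unit test" for it,
// stripped of any framework to show the bare idea.
final class MaxTest {
  static int max(final int first, final int second) {
    return first > second ? first : second;
  }
  static void maxReturnsTheBiggerNumber() {
    // Run the code and check what it did.
    if (max(2, 3) != 3 || max(5, -1) != 5) {
      throw new AssertionError("max() is broken");
    }
  }
}
```

Every time you need to "see it in action," you rerun this check instead of poking the application by hand.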

I seriously can't understand how it is possible to write something and then not test it. Well, unless you're paid monthly and nobody really cares about your deliverables. If you do care about the software you produce, you're interested in seeing it in action, right?

So, how do you do this?

If it's a one-page PHP website, you can probably run it locally on Apache, modify it on disk, and then Cmd+R many times. That will work for a primitive piece of code and only for you, a single developer of it. But I hear this "I don't have time" argument from programmers working on enterprise systems. How do you guys test your code?

I would compare unit tests with OOP classes. You can design the entire application in a single class with a few thousand methods. You will save time on creating other classes, structuring them, thinking about connections between them, etc. It will be a single 20,000-line .java file. And you'll say that "you didn't have time to create classes," right? What would we say about such a product and the author of it? Right, we'd say he or she is just stupid. And it has nothing to do with time or budget. Such a programmer just doesn't know how to use object-oriented programming tools, like encapsulation, inheritance, polymorphism, interfaces, method overloading, etc. It's not about time or budget; it's about skills and discipline.

The same is true for unit tests. If you create code without unit tests, it may work, just like that monster class with 20,000 lines, but the quality of your product will be very low. And not because you didn't have time to write unit tests, but because you didn't know how to do it.

So every time I hear, "I didn't have time for unit testing," I understand that you just didn't know how and are trying to conceal that fact behind false excuses. It's not professional, to say the least.

Meetings Are Legalized Robbery

  • Washington, D.C.

Software development is all about creativity, right? It's an art, not a science. As software engineers, we solve complex problems, and often our solutions are absolutely not obvious. We experiment, innovate, research, and investigate. To do all this, we talk. We sit together in our meeting rooms, Skype conference calls, or Slack channels; we discuss our solutions; we ask our coworkers for their opinions; and we argue about the best ideas. There's no doubt meetings are the key component of the modern software design discipline ... and it's very sad to see it.

Heat (1995) by Michael Mann

A good software architect, as well as a good project manager, doesn't need meetings and never organizes them.

Meetings demotivate, waste time, burn money, and degrade quality. But more about that later. For now, let's discuss a proposed alternative.

Say I'm an architect who needs to design the schema for a relational database in a new project, and I have a team of five programmers whose help I want with this design. That's a very logical and appropriate desire, as a good architect always discusses all possible options with the available team members before making a final decision. So I call a meeting? No!

A Good Architect

I ask Jeff, one of our programmers, to create a draft of the DB schema, but I don't actually talk to him about it. I value and respect his time---there's no need to bother him with this organizational noise, so I just create a ticket and assign it to Jeff. When he has time, he creates a draft and returns a pull request. I review it and make some comments before he updates the branch, and finally I merge it.

Perfect; we have a draft.

At the end of the document, Jeff also listed assumptions, risks, and concerns. For example, this is what I got back from him (it's Markdown, a very convenient format for simple technical documents; I highly recommend it):

## Tables
user (id INT, name VARCHAR, email VARCHAR);
payment (id INT, date DATETIME, amount INT);
order (id INT, details VARCHAR, user_id INT FK(user));

## Assumptions
- All payments will be in whole dollars, no cents.
- All users will have only one email.
- There will be no search feature required.

## Risks
- Order details may not fit into VARCHAR.
- Foreign keys may not be supported in the DBMS.

## Concerns
- Would NoSQL be more suitable?
- What is the DB server we'll use?

I don't know how much time Jeff would have spent on this document, but I just created it in 10 minutes. Of course, it's a very simple schema for a very simple project. But even if Jeff spent an hour on it, every minute of that hour is valuable to the project. What I mean is that every dollar I pay Jeff for his time is returned to me in the form of a text document.

Now I have a draft and I'm taking the next step. I ask Monica to take a look at it and suggest changes. Again, it's another hour and I've got back a pull request with changes, corrections, new assumptions, new risks, and new concerns. I'm not talking to Monica---there is no need for that. She has all the information she needs to work with the DB schema. She is a good engineer, and I trust her ability to formulate her opinion in a written format.

There's no need to sit together in the same room or stand at a whiteboard. Monica is smart enough to do this job by herself. She already has all the ideas expressed by Jeff in front of her; there is no need for her to talk to him either.

Once I merge her pull request (after a proper review and corrections), I have a new version of my schema.md document.

Moreover, I have a Git history of this document, and I have a history of pull requests with comments. This is way better than meeting notes or even a meeting video. Anyone who joins the project later will be able to understand how we came to the conclusion of using PostgreSQL and storing monetary amounts without cents. It's all there in the Git history and GitHub tickets, forever with us.

What if I need more opinions? Or if I'm still not sure the schema is good enough? No problem; I ask someone else to review it one more time and send me a pull request with changes. I can even ask Jeff to do it again after a few iterations with other programmers!

Moreover, I can add my own concerns to the document, and on the next iteration, ask Jeff to pay attention to and resolve them.

The more we circulate this document, the better it becomes. And every minute the project pays for turns into a tangible product: a document with a change history!

That's how a professional architect collects opinions and makes complex decisions. Now let's see what a bad architect would do.

A Bad Architect

I first call a meeting. No, wait; I schedule it in Google Calendar. No, wait, wait. First of all, I create an agenda:

1. Introduction: 10 min
2. Problem: 15 min
  - We need a DB schema
  - Let's choose a server
3. Opinions: 15 min
  - Jeff and Monica have experience
  - Others?
4. Coffee break: 10 min
5. Discussion: 30 min
  - Let's not forget risks
  - Ask Joe about PostgreSQL
6. Conclusions: 10 min

I'm sure you know what I'm talking about and you've seen these agendas from your "architects." Anyway, my first step is done. I've scheduled an hour-and-a-half meeting where all programmers will be present. We'll have fun and drink coffee. We'll discuss the problem, hear all opinions, and find the best solution. We'll document it in that schema.md and get back to our tasks.

Instead of circulating those dry and boring Git documents, we'll have real human communication with a nice coffee break where we'll share our opinions about the last episode of The Big Bang Theory. Isn't that what we all like about our office jobs anyway?

I don't think so.

I think meetings demotivate, waste time, burn money, and degrade quality. Those who organize them either have no idea what they are doing or are silently robbing the company they're working for.

We needed meetings 30 years ago when we didn't have laptops and GitHub. But even then, we had a pen and paper.

I'm talking about meetings that are intended to collect or distribute information, discuss or propose something, or find a solution in a technical domain. The only valid purpose of meetings is to read body language of the people you are dealing with. Politicians, businessmen, poker players, shrinks, physicians, etc.---they need meetings because they must read our bodies, not just our thoughts.

Do we really care about Monica's body while we're designing a DB schema? Well, that depends, right? But let's be serious; we're not paid for that.

Meetings Demotivate

The best motivation for a creative person is to see the results of his or her creativity. If I'm not mistaken, enjoying the process of creativity without any results is an obvious sign of mental illness. A healthy and creative person like a software engineer, for example, wants to see how his or her efforts are turned into something valuable and, preferably, tangible.

Meetings almost never produce anything tangible and rarely something valuable. Sometimes we have "minutes" of our meetings, but they are just short pieces of paper with a brief summary of what we were talking about. Not a real "product" for a creative person.

Thus, they demotivate me because I simply don't see what two hours of my life were turned into. We don't really create anything there, we just discuss. Pay attention here; I'm talking about meetings, not about collaborative work on something, like in pair programming, for example.

You may say that some meetings actually produce great ideas, which are very tangible results. You may say that only in a direct, face-to-face setting could these ideas be born. You may also say that many bright technical decisions were invented precisely due to a magic synergy of friends thinking in the same direction, at the same time, and in the same room. This argument makes sense, but I'll address that later.

Meetings Waste Time

I think it's obvious. For the first few minutes of the meeting, you're concentrated; then you start checking your Twitter feed and doodling. Everybody is doing the same---not because we're lazy but because there is no demand for our full focus at the meeting.

While Monica is working with the document, making comments and suggestions, she is fully concentrated on the result---mostly because there is nobody else to help her. She has to deliver that pull request, and I'm waiting for her. Her concentration is as high as it would be at the meeting, when someone is asking her a direct question and she has to provide a detailed answer.

Her time is optimized for a suitable outcome while everybody else is doing something else.

At the meeting, on the other hand, we're all concentrating sparingly at best. And the longer the meeting, the slower we are. Also, the more people who are there, the less we care and the more interested we become in our Facebook updates. I reckon that's not much of a surprise to you if you've been in the software development industry for at least a few months.

Meetings Burn Money

This aligns closely with the previous conclusion. Meetings are among the biggest budget consumers of any activity in a project, simply because the project pays everybody who is sitting in that room or on that Skype conference, while the result they produce is almost equal to what a single person could deliver. Or much less.

Even though this may sound extreme, I have to say that I consider meetings a legalized way to rob a project. Most meeting organizers (architects, CTOs, CEOs, programmers, etc.) don't realize that. They believe the more they talk, the better engineers they are. And their bosses, by mistake, appreciate that sort of activity from their subordinates.

It's robbery!

I told you to create a draft of a DB schema. Now you're coming to me and asking for a meeting because certain "aspects" are not clear? Did you study software engineering anywhere? Do you know how to work with technical documents? Are you capable of writing in a way that everybody else can understand and respond to you, also in writing? No? Now you want the project to not only pay you for the DB schema draft but to also pay me for talking to you and for a few other guys to sit next to us and text their friends? You basically want to rob the owner of this project. No more, no less.

Meetings Degrade Quality

Quality goes up when it is being controlled. When someone tells me every day that my code is buggy and needs improvements, my quality goes up. When nobody cares what I do or how good my results are, my quality goes down, no matter how self-motivated I am.

Meetings promote group achievements and discourage individual ones. At the end of a meeting, it's often not clear who exactly suggested a good idea and who made the biggest effort. In other words, at the end of a meeting, I don't know how good I am. Am I still the same as a month ago or am I a better engineer?

They smile more and ask me "what do you think?" more frequently than last summer; that must mean something, right? I'm sure you understand this is not the kind of feedback a serious engineer would expect.

A serious software developer wants to produce tangible results and receive tangible feedback, like money or code reviews. I want to know what was wrong in my DB schema draft and what I missed. And I want this to be documented somewhere. This is what makes me better, and this is how I learn and grow.

What About the A-ha! Moment?

Now, what about true creativity or that well-known "a-ha!" moment? Sometimes it's necessary to "think out loud" in order to invent something, right? We all know how important a face-to-face interaction could be when we're researching and developing something new. Where would we be without meetings? We can't simply work with documents; we need to talk to each other in order to let our ideas out. Isn't it obvious?

I have only one argument for that. Did Einstein invent his theory at a meeting with his colleagues? I don't think so. And he was solving a much bigger problem than a DB schema design.


Let me summarize. Meetings are an activity that requires almost no skill, while documenting ideas in text and diagrams is a way more difficult job to do. So train and discipline yourself to work with documents and let juniors enjoy their meetings.

© Yegor Bugayenko 2014–2018

Catch Me If You ... Can't Do Otherwise


  • Dallas, TX

I don't know whether it's an anti-pattern or just a common and very popular mistake, but I see it everywhere and simply must write about it. I'm talking about exception catching without re-throwing. I'm talking about something like this Java code:

try {
  stream.write(data);
} catch (IOException ex) {
  ex.printStackTrace();
}
Catch Me If You Can (2002) by Steven Spielberg

Pay attention: I don't have anything against this code:

try {
  stream.write('X');
} catch (IOException ex) {
  throw new IllegalStateException(ex);
}

This is called exception chaining and is a perfectly valid construct.

So what is wrong with catching an exception and logging it? Let's try to look at the bigger picture first. We're talking about object-oriented programming---this means we're dealing with objects. Here is how an object (its class, to be exact) would look:

final class Wire {
  private final OutputStream stream;
  Wire(final OutputStream stm) {
    this.stream = stm;
  }
  public void send(final int data) {
    try {
      this.stream.write(data);
    } catch (IOException ex) {
      ex.printStackTrace();
    }
  }
}

Here is how I'm using this class:

new Wire(stream).send(1);

Looks nice, right? I don't need to worry about that IOException when I'm calling send(1). It will be handled internally, and if it occurs, the stacktrace will be logged. But this is a totally wrong way of thinking, and it's inherited from languages without exceptions, like C.

Exceptions were invented to simplify our design by moving the entire error handling code away from the main logic. Moreover, we're not just moving it away but also concentrating it in one place---in the main() method, the entry point of the entire app.

The primary purpose of an exception is to collect as much information as possible about the error and float it up to the highest level, where the user is capable of doing something about it. Exception chaining helps even further by allowing us to extend that information on its way up. We are basically putting our bubble (the exception) into a bigger bubble every time we catch it and re-throw. When it hits the surface, there are many bubbles, each remaining inside another like a Russian doll. The original exception is the smallest bubble.
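The bubble metaphor can be sketched in code. Below, Cable and Report are hypothetical names of my own; Cable lets the original IOException float up, while Report catches it only to wrap it into a bigger bubble and re-throw:

```java
import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch: nothing is "popped" here; each layer either
// lets the exception fly or chains it and re-throws. Only the very
// top of the app finally handles it.
final class Cable {
  private final OutputStream stream;
  Cable(final OutputStream stm) {
    this.stream = stm;
  }
  public void send(final int data) throws IOException {
    this.stream.write(data);
  }
}

final class Report {
  private final Cable cable;
  Report(final Cable cbl) {
    this.cable = cbl;
  }
  public void publish() {
    try {
      this.cable.send(1);
    } catch (final IOException ex) {
      // chain, don't pop: the original exception travels inside
      throw new IllegalStateException("can't publish the report", ex);
    }
  }
}
```

At the top level, a single catch in main() can then print the whole chain of bubbles, with the original exception as the root cause.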

When you catch an exception without re-throwing it, you basically pop the bubble. Everything inside it, including the original exception and all other bubbles with the information inside them, is in your hands. You don't let me see them. You use them somehow, but I don't know how. You're doing something behind the scenes, hiding potentially important information.

If you're hiding that from me, I can't promise my user that I will be honest with him and openly report a problem when it occurs. I simply can't trust your send() method anymore, and my user will not trust me.

By catching exceptions without re-throwing them, you're basically breaking the chain of trust between objects.

My suggestion is to catch exceptions as seldom as possible, and every time you catch them, re-throw.

Unfortunately, the design of Java goes against this principle in many places. For example, Java has checked and un-checked exceptions, while there should only be checked ones in my opinion (the ones you must catch or declare as throwable). Also, Java allows multiple exception types to be declared as throwable in a single method---yet another mistake; stick to declaring just one type. Also, there is a generic Exception class at the top of the hierarchy, which is also wrong in my opinion. Besides that, some built-in classes don't allow any checked exceptions to be thrown, like Runnable.run(). There are many other problems with exceptions in Java.

But try to keep this principle in mind and your code will be cleaner: catch only if you have no other choice.

P.S. Here is how the class should look:

final class Wire {
  private final OutputStream stream;
  Wire(final OutputStream stm) {
    this.stream = stm;
  }
  public void send(final int data)
    throws IOException {
    this.stream.write(data);
  }
}


Public Static Literals ... Are Not a Solution for Data Duplication


  • Palo Alto, CA

I have a new String(array,"UTF-8") in one place and exactly the same code in another place in my app. Actually, I may have it in many places. And every time, I have to use that "UTF-8" constant in order to create a String from a byte array. It would be very convenient to define it once somewhere and reuse it, just like Apache Commons is doing; see CharEncoding.UTF_8 (There are many other static literals there). These guys are setting a bad example! public static "properties" are as bad as utility classes.

The Shining (1980) by Stanley Kubrick

Here is what I'm talking about, specifically:

package org.apache.commons.lang3;
public class CharEncoding {
  public static final String UTF_8 = "UTF-8";
  // some other methods and properties
}

Now, when I need to create a String from a byte array, I use this:

import org.apache.commons.lang3.CharEncoding;
String text = new String(array, CharEncoding.UTF_8);

Let's say I want to convert a String into a byte array:

import org.apache.commons.lang3.CharEncoding;
byte[] array = text.getBytes(CharEncoding.UTF_8);

Looks convenient, right? This is what the designers of Apache Commons think (one of the most popular but simply terrible libraries in the Java world). I encourage you to think differently. I can't tell you to stop using Apache Commons, because we just don't have a better alternative (yet!). But in your own code, don't use public static properties---ever. Even if this code may look convenient to you, it's a very bad design.

The reason why is very similar to utility classes with public static methods---they are unbreakable hard-coded dependencies. Once you use that CharEncoding.UTF_8, your object starts to depend on this data, and its user (the user of your object) can't break this dependency. You may say that this is your intention, in the case of a "UTF-8" constant---to make sure that Unicode is specifically and exclusively being used. In this particular example, this may be true, but look at it from a more global perspective.

Let me show you the alternative I have in mind before we continue. Here is what I'm suggesting instead to convert a byte array into a String:

String text = new UTF8String(array);

It's pseudo-code, since Java designers made class String final and we can't really extend it and create UTF8String, but you get the idea. In the real world, this would look like this:

String text = new UTF8String(array).toString();

As you see, we encapsulate the "UTF-8" constant somewhere inside the class UTF8String, and its users have no idea how exactly this "byte array to string" conversion is happening.
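Such a class could look like the following compilable sketch (the defensive clone of the array is my own choice, not part of the original pseudo-code):

```java
import java.nio.charset.StandardCharsets;

// Sketch of the decorator described above; since String is final,
// the result is exposed through toString(). The "UTF-8" knowledge
// is a private detail of this class, not shared data.
final class UTF8String {
  private final byte[] array;
  UTF8String(final byte[] bytes) {
    this.array = bytes.clone();
  }
  @Override
  public String toString() {
    return new String(this.array, StandardCharsets.UTF_8);
  }
}
```
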

By introducing UTF8String, we solved the problem of "UTF-8" literal duplication. But we did it in a proper object-oriented way---we encapsulated the functionality inside a class and let everybody instantiate its objects and use them. We resolved the problem of functionality duplication, not just data duplication.

Placing data into one shared place (CharEncoding.UTF_8) doesn't really solve the duplication problem; it actually makes it worse, mostly because it encourages everybody to duplicate functionality using the same piece of shared data.

My point here is that every time you see that you have some data duplication in your application, start thinking about the functionality you're duplicating. You will easily find the code that is repeated again and again. Make a new class for this code and place the data there, as a private property (or private static property). That's how you will improve your design and truly get rid of duplication.

P.S. You can use a method instead of a class, but not a static literal.


The Better Architect You Are, The Simpler Your Diagrams


  • Palo Alto, CA

I don't even know where to start. Let's begin with this: If I don't understand you, it's your fault. This has to be the most basic, fundamental principle of a good software architect (well, of any engineer), but most of the architects I've met so far, in many companies, don't seem to believe in it. They don't understand that the job of a software architect is to make complex things simple, not the other way around. They use diagrams, which are the most popular instruments of an architect, to explain to us, programmers, what they have in mind. But the diagrams are usually very cryptic and hard to digest. What's worse is that the complexity goes up in line with their salaries---it's disgusting.

A Beautiful Mind (2001) by Ron Howard

Why is this happening? Why are their diagrams complex and difficult to read? I'm sure you know what I'm talking about; you probably have your own examples of such diagrams from projects and architects you've worked with. So why do we have them?

Architects are proud of complexity, that's why. They think that the more complex the problem they're working with, the better an engineer they are. I've had this dialog many times:

- You know, your diagram looks so complex ...
- Oh yeah, we're solving complex problems here!

Usually, after that, the architect smiles with an obvious feeling of satisfaction. Indeed, someone actually noticed how difficult his job is and appreciated his efforts. Someone is stupid, and he is smart. He can understand this multi-tier architecture, and I can't. He definitely earns my respect, right?

Wrong! A good architect knows his main role is to decompose a complex problem into less complex components and let programmers solve them one by one. Just as a good project manager has to decompose a complex task into smaller ones. When the problem is properly decomposed (broken down into smaller, isolated and properly decoupled pieces), the complexity decreases, and it becomes easier for everybody to understand and resolve.

The main virtue of an architect is the ability to reduce complexity. Thus, a good architect would never be proud of a complex diagram. Instead, he would be proud of a simple and easy-to-understand drawing with a few rectangles that perfectly explain an entire multi-tier application. That is what is really difficult to do. That's where a true architectural mind shines.

There are not many architects like that. I can't say I'm one of them yet, but I have a few recommendations for your diagrams. Read on and remember that the main goal of all this is to reduce complexity.

No More Than Five Rectangles. If you have more, there is something wrong. Try to explain yourself in less than five. Just group some of them together and give it a name. You don't want me to spend more than a few seconds trying to understand who is participating in the show you're presenting. I want to see them all at one glance and immediately understand who is who. I just made up the number five, but you get the idea---make sure all diagram participants are easy to count. I've seen diagrams with 25 or more rectangles ... that's unacceptable.


Use UML. Well, use whatever notation you feel comfortable with, but many years ago people agreed that instead of using different notations, it would be easier to learn one for all; that's UML. It's a huge format/standard/language, but you don't need to know all of it. Just learn the basics; that will be enough to express almost any idea you have. I would recommend UML Distilled: A Brief Guide to the Standard Object Modeling Language (3rd Edition) by Martin Fowler.

Direct and Annotate Lines. There is nothing more annoying on a diagram than a line connecting two rectangles without any text on it and without any direction. Is it a flow of data? Is it a compile-time dependency? There are many possible meanings. Always use arrows, and always annotate them---this will help me understand you much faster.

Don't Use Colors. Or let me put it this way: Don't abuse colors. And in order to avoid abusing them, you're better off staying away from colors in the first place. If you need to use colors, there must be something wrong with your diagram. It's probably too complex; that's why you need to use colors. Simplify it by grouping elements.

Don't Be Creative. It's not art; it's engineering. You don't need to impress me; you need to deliver the message. Your goal is not to show how sophisticated your mind is. Moreover, your diagram style should not be personal. A diagram from you and a diagram from another architect should look almost exactly the same if they deliver the same message. It's called uniformity. That's how you make them easier for me to understand. I don't want to have to learn your personality in order to understand your diagram. If it's a server, draw a rectangle. There's no need to put a 3D picture of an HP server there. A rectangle is enough. Also, please no shades, no fonts, and no styles. Again, it's not an artistic contest. I will understand your rectangle pretty well without that "nice" shadow you're tempted to drop. I will also understand an arrow with a default width; no need to make it wider just because your diagram editor allows you to. Don't waste your time and my time on all this styling. Just focus on those simple lines, rectangles, text, and arrows.


As I mentioned above, the goal of all this is to reduce complexity and help me, a programmer, understand you, an architect. Remember, if I can't understand you, it's your fault. You're a bad architect if you can't deliver your ideas in a plain, simple form.


XML Data and XSL Views in Takes Framework


  • Palo Alto, CA

A year ago, I tried to explain how effectively data and its presentation can be separated in a web application with the help of XML and XSL. In a few words, instead of using templating (like JSP, Velocity, FreeMarker, etc.) and injection of data into HTML, we compose them in the form of an XML document and then let the XSL stylesheet transform them into HTML. Here is a brief example of how all this can be used together with the Takes framework.

First, let's agree that templating is a bad idea in the first place. Yes, I mean it. The entire design of JSP is wrong, with all due respect to its creators. Here is how it works: Let's say my website has to fetch the current exchange rate of the euro from a database and show it on the home page. Here's how my index.jsp would look:

<html>
  <body>
    <p>EUR/USD: <%= rates.get() %></p>
  </body>
</html>

In order to create HTML, the JSP engine will have to call get() on object rates and render what's returned through toString(). It's a terrible design for a few reasons. First, the view is tightly coupled with the model. Second, the flexibility of rendering is very limited. Third, the result of rendering is not reusable, and views are not stackable. There are many other reasons ... more about them in one of the next articles.

Let's see how this should be done right. First, we let our model generate the output in XML format, for example:

<?xml version="1.1"?>
<page>
  <rate>1.1324</rate>
</page>

This is what the model will produce, having no knowledge of the view. Then, we create the view as an XSL stylesheet, which will transform XML into HTML:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml">
  <xsl:template match="page">
    <html>
      <body>
        <p>
          <xsl:text>EUR/USD: </xsl:text>
          <xsl:value-of select="rate"/>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

As you see, the view doesn't know anything about the model in terms of implementation. All it knows is the format of XML data output produced by the model. Here is how you design it in the Takes framework. Let's start with a simple example:

import org.takes.http.Exit;
import org.takes.http.FtCli;
public final class Entry {
  public static void main(final String... args) throws Exception {
    new FtCli(new TkApp(), args).start(Exit.NEVER);
  }
}

It's a simple web application that starts a web server and never ends (it waits for connections in daemon mode). To make it work, we should create a simple "take" named TkApp:

import org.takes.Request;
import org.takes.Response;
import org.takes.Take;
import org.takes.rs.RsText;
import org.takes.rs.RsWithType;
final class TkApp implements Take {
  @Override
  public Response act(final Request req) {
    return new RsWithType(
      new RsText(
        "<page><rate>1.1324</rate></page>"
      ),
      "application/xml"
    );
  }
}

This "take" always returns the same XML response, but it doesn't do any XSL transformation yet. We need to add the RsXSLT class to the picture:

@Override
public Response act(final Request req) {
  return new RsXSLT(
    new RsWithType(
      new RsText(
        "<?xml-stylesheet type='text/xsl' href='/xsl/index.xsl'?>"
        + "<page><rate>1.1324</rate></page>"
      ),
      "application/xml"
    )
  );
}

Excuse me for using string concatenation, which is a bad practice; it's merely for the simplicity of the example.

As you see, I also added an XML stylesheet processing instruction to the XML. RsXSLT will understand it and try to find the /xsl/index.xsl resource on classpath. You see the content of that file above.

That's it.

Well, not really. Building XML from strings is not a good idea. We have a better instrument in the Takes framework. We use Xembly, which is a simple imperative language for building and modifying XML documents. More about it here: Xembly, an Assembly for XML.

Here is how our TkApp would look:

@Override
public Response act(final Request req) {
  return new RsXSLT(
    new RsWithType(
      new RsXembly(
        new XeChain(
          new XeStylesheet("/xsl/index.xsl"),
          new XeAppend(
            "page",
            new XeDirectives(
              new Directives().add("rate").set("1.1324")
            )
          )
        )
      ),
      "application/xml"
    )
  );
}

The most important class here is RsXembly. The idea is to let model classes expose their data through Xembly "directives," which will later be applied to a DOM structure by RsXembly.

XeChain, XeStylesheet, XeAppend, and XeDirectives expose directives but with different meanings (they are all instances of an XeSource interface). Their names describe their intentions rather well. XeChain just chains everything that is delivered by encapsulated "directive sources." XeStylesheet returns directives that create a single XML processing instruction. XeAppend creates an XML node and adds encapsulated directives to it. XeDirectives simply returns what's inside.

In the end, this code will create exactly the same XML document as I created above with string concatenation.

The beauty of this approach is in the perfect decoupling of data generation and XML building and translation between XML and HTML. It is perfectly reusable and "stackable." We can transform the data in XML format multiple times, applying different XSL stylesheets to each one. We can even transform them into JSON without changing a line of code in model classes.

Moreover, we can format them differently, using rather powerful XSLT 2.0 instruments. XSLT by itself is a powerful and purely functional language that enables any possible data manipulations. No templating engine is even close to it.
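In fact, the stylesheet shown earlier doesn't even need a framework; the JDK's built-in XSLT 1.0 processor can apply it directly. Here is a minimal sketch (the class name XslDemo and the inlined strings are mine, for illustration only):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Applies the page/rate stylesheet to the model's XML with plain JAXP,
// demonstrating that the view (XSL) and the model (XML) stay decoupled.
public final class XslDemo {
  static String transform() throws Exception {
    final String xml = "<page><rate>1.1324</rate></page>";
    final String xsl =
      "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns='http://www.w3.org/1999/xhtml'>"
      + "<xsl:template match='page'><html><body><p>"
      + "<xsl:text>EUR/USD: </xsl:text>"
      + "<xsl:value-of select='rate'/>"
      + "</p></body></html></xsl:template>"
      + "</xsl:stylesheet>";
    final Transformer transformer = TransformerFactory.newInstance()
      .newTransformer(new StreamSource(new StringReader(xsl)));
    final StringWriter html = new StringWriter();
    transformer.transform(
      new StreamSource(new StringReader(xml)),
      new StreamResult(html)
    );
    return html.toString();
  }
  public static void main(final String... args) throws Exception {
    System.out.println(XslDemo.transform());
  }
}
```

The resulting HTML contains the rendered rate; swapping in a different stylesheet changes the presentation without touching the model.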

Take a look at how it works in the RsPage class in Rultor, for example.


A Few Valid Reasons to Reject a Bug Fix


A bug exists when something doesn't work as expected. A bug fix is basically a patch (a pull request) to the existing code base that is supposed to solve the problem and make sure that "something" works as expected. Very often, such a patch fixes one thing and breaks many others. I believe that sometimes it's necessary to reject a bug fix and ask its author to re-do the patch in order to protect the project from bigger problems. There are a few valid reasons for such a rejection, according to my experience.

El Crimen Perfecto (2004) by Álex de la Iglesia

It Degrades Code Coverage

This is a very common situation: After the changes are made in one place, unit tests fail in some other place. The bug is fixed, but some possibly unrelated unit tests start to report failure. Under pressure or simply because we're lazy, we don't fix them; we simply remove the tests or mark them as temporarily "skipped." The problem is solved, the build is clean, so let's merge the patch and call it a day, right? Wrong!

Even though I'm in favor of cutting corners as much as possible, this is the corner I don't recommend you cut.

The unit tests are there precisely to prevent us from breaking the product when under pressure.

Obviously, there are situations when the unit tests are wrong and we have to delete them. In those cases, don't forget to create new ones.


There are also situations when the bug must be fixed in a few minutes to put the system back online and fixing all unit tests will take an hour. Such a situation is a strong indicator that you've got a terrible underlying situation with test coverage in the product. There's no doubt that we have to make a fix and ask our tests to shut up for some time. But in this case, make sure the next task your team is working on after the bug fix is released is correcting those disabled unit tests. I would recommend reading Working Effectively With Legacy Code by Michael Feathers, which tackles this very subject.

It Doesn't Reproduce the Issue

Sometimes the entire system may be down simply because of a small typo in one line of code. An obvious bug fix is to remove the typo, but that's not what a good project is expecting from us if we care about its quality. The problem is not the typo but rather the absence of unit tests that would catch the typo at the deployment phase.

The real problem is the lack of test code coverage in this particular section of the code. By removing the typo, we're not helping the project in any way. Moreover, we're doing it a disservice---we're concealing the real problem.

Thus, no matter how small or cosmetic the issue is, its bug fix must contain an extra test that first reproduces the bug. Without such a test, a bug fix is a waste of the project's money.

Furthermore, without a unit test reproducing the issue, there is no guarantee that our bug fix doesn't introduce more bugs. I would even say that the more bug fixes we have, the higher the entropy. And the only way to decrease this uncertainty is by covering the code with unit tests. Without a test, a bug fix brings more disorder to the code base.
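As a sketch of what "a test that first reproduces the bug" could look like (the Report class and the whitespace bug are invented purely for illustration; this is not code from any real project):

```java
// Hypothetical example: the "typo" was that report titles kept their
// surrounding whitespace. The fix ships together with a check that
// reproduces the bug first.
final class Report {
  private final String title;
  Report(final String title) {
    this.title = title;
  }
  String title() {
    // The one-line fix: before it, this method returned this.title as-is
    return this.title.trim();
  }
}

public final class Main {
  public static void main(final String[] args) {
    // Written before the fix, this check failed; now it guards the fix
    if (!"Sales".equals(new Report(" Sales ").title())) {
      throw new AssertionError("regression: title is not trimmed");
    }
    System.out.println("OK");
  }
}
```

With the check committed first, anyone who later reintroduces the bug will break the build instead of silently breaking production.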

It Is Too Big

Bug fixes are not features; they must be small and focused. It's a very typical mistake for programmers to get carried away while fixing a bug and introduce some refactoring together with a fix. The result is that the patch gets rather big and difficult to understand. I'm not against refactoring; it's a very important and positive thing for a project, but do it separately after the bug is fixed and merged.

No refactoring while fixing a bug!

Create a new unit test, reproduce the bug, and commit it. Fix the bug in the existing code base, no matter how ugly it is. Create new bugs, asking the team to improve the situation with the ugly code base. If interested, assign those bugs to yourself. Or maybe somebody else will be interested in fixing them and refactoring the code. But all that will happen later in other pull requests with new code reviews and new merges.

It's not about being lazy and unwilling to fix what looks bad. It's about a discipline, which is much more important than good intentions.

It Solves More Than One Issue

Always fix one issue at a time---simple as that. No exceptions. When a bug fix patch contains code changes that fix multiple issues, it is very difficult to understand which issue is tested, which one is reproduced, and how they relate to each other. Combining several bug fixes into a single pull request is a very bad practice.

No matter how simple the fix is, keep it separate from others. Review, test, and merge it individually. This will also increase the traceability of changes. It will always be easy to understand who made that fix, who reviewed the code, and when it was merged (and deployed).

© Yegor Bugayenko 2014–2018

Good Programmers Write Bug-Free Code, Don't They?


Good programmers create fewer bugs while bad programmers cause more. Sounds logical, doesn't it? However, there is a lot of criticism of this way of thinking. Take this one, for example: Bugs are inevitable, and instead of expecting fewer bugs from us, let us focus on the right design and let testers find and report bugs; then we'll fix them. Or this one: Being afraid to make a mistake makes me write slower and experiment less, which translates into lower-quality software. Read more about that here and here. But allow me to look at this from a different perspective and assert that yes, indeed, good programmers create fewer bugs.

Sabotage! (2000) by Esteban and Jose Miguel Ibarretxe

I think this is all about how we define quality and what a bug is.

If we look at a traditional and very "intuitive" definition of a bug, it is something that causes our software to produce an incorrect or unexpected result. However, if we think more about how the software is actually used and by whom, we'll see that there are many other types of bugs, including scalability, reliability, and even maintainability ones.

If we put all those "-ilities" in a list and prioritize them by their severity and importance to the business, we'll see that functionality-related bugs are rather far from the top. I would actually put maintainability at the top.

My point is that mistakes are not all equal. If I'm writing a PDF report generated by a piece of Java code and my report misses the footer, that's one type of bug, and its fix will cost the business X dollars. On the other hand, if my PDF generation code is so difficult to modify that in order to change its format from A4 to US Letter we have to rewrite it from scratch, that's a completely different type of bug. Needless to say, its fixing will be many times more expensive.

So yes, mistakes are inevitable. We should not be afraid of them and be ready to make them. However, good programmers make cheaper mistakes in order to avoid making more expensive ones.

Good programmers understand that in the limited amount of time we usually have to implement the software, we have to sacrifice functionality in order to gain maintainability. Ideally, you want to achieve both, but in reality, it's next to impossible.

We all work under pressure, and we have time and money constraints. Within these constraints, good programmers prefer to make functionality buggy and incomplete while keeping the design clean and easy to maintain. There are exceptions, of course, where the business prioritizes functionality above everything else, but such situations happen very rarely (if the business is smart).

To summarize, I think that a good programmer makes more functional bugs than a bad programmer, though the bugs made by a bad programmer are more expensive than bugs made by a good programmer.


Software Outsourcing Survival Guide


Software outsourcing is what you go for when you want to create a software product but software engineering is not your core competence. It's a smart business practice being employed by everyone from $1,000 personal website owners to Fortune 100 monsters. And all of them fail, to some extent. Actually, it's very difficult not to fail. Here is my list of simple hints to everyone who decides to outsource software development (the most important ones are at the bottom). I wish someone had given it to me 15 years ago.

U Turn (1997) by Oliver Stone

Have a "Work for Hire" Agreement. Make sure the contract you have with the software outsourcing team includes something like this: "Parties shall deem all deliverables created by the developer as works made for hire as is defined under the Copyright Law of the United States." This is the shortest and easiest definition of "whatever you create for me is mine." Put this into the contract and the outsourcing company won't be able to claim that the software it created belongs to it, which happens here and there.

Own Your Source Code Repository. Make sure the source code repository is under your control. The best way to do this is to create a private GitHub repository for $7 per month. The repository will be visible and accessible only by you and your outsourcing team. Moreover, make sure the team has read-only access and can't change the code directly except through pull requests. In Git, it's possible to destroy the entire history with a single "forced" push to the master branch. So it would be much better for you to be the only person with write access. To make life simpler, I would recommend you use Rultor as a tool for merging those pull requests semi-automatically.

Regularly Collect Metrics. Ask your outsourcing team members to regularly collect metrics from the software they create and send them to you somehow (by email, maybe). I would recommend using Hits-of-Code, unit test coverage (or just the total number of unit tests), tickets opened and closed, and build duration. I'm talking here about process metrics. This is not what you're already getting from NewRelic. These metrics will measure the performance of the team, not of the product under development. I'm not saying you should manage the team by the metrics, but you have to keep an eye on these numbers and their dynamics.

Conduct Independent Technical Reviews. I wrote about these already in my You Do Need Independent Technical Reviews! post a few months ago. The importance of such reviews is difficult to overrate. In software outsourcing, they are especially crucial. Actually, this is the best and likely only way of collecting objective information about the software you're getting from the outsourcer. Don't rely on reports, promises, guarantees, and extensive documentation. Instead, hire someone else on an hourly basis and ask that person to review what your outsourcing partner is developing. Do such reviews regularly and systematically. Don't be afraid to offend your programmers. Honestly explain the reviewer's concerns to them. If they are professionals, they will understand and respect your business objectives. You can also show them this article :)

Automate and Control Deployment. Ask your outsourcing team to automate the entire deployment pipeline and keep it under your control. I would recommend you do this during the first days of the project. This means the product should be compiled, tested, packaged, installed, and deployed to a production repository (or server/s) by a single click. Some script should be created to automate this chain of operations. That's what your outsourcing partner has to create for you. Then, when development starts, every time a new change is made to the repository that has to be deployed to production, the same script has to be executed. What is important here is that you should know how this script works and how to run it. You should be able to build and deploy your product by yourself.

Demand Weekly Releases. Don't wait for the final version. Ask your outsourcing team to release a new version every week. No matter how intensive the development is and how many features are "in progress," it's always possible to package a new version and release it. If the development is really intensive, ask your team to use GitFlow or something similar to isolate a stable production branch from development branches. But don't wait! Make sure you see your software packaged and deployed every single week, no exceptions and no excuses. If your outsourcing team can't give that to you, start worrying and change something.

Hire an Independent CTO. This advice is mostly for small companies or individuals who outsource software development and rely on the outsourcing team's expertise while staying focused on their own business development. That's unwise; you should have an independent chief technical officer (CTO) who reports to you and controls how the outsourcing team works. This person must be on a different payment schedule with different goals, terms, and objectives. You should talk to the CTO, and the CTO should control the offshore team. Very often, business owners try to become software savvy and control the software team directly, learning their software jargon, principles of DevOps, and even Java syntax (I've seen that). This is a route to failure. Be smart---you do the business development, the CTO reports to you, and the software team reports to the CTO.

Define Rewards and Punishments. There is no management without these two key components. You're not supposed to manage all programmers in the outsourcing shop, but you have to manage the entire shop as a single unit of control. You have to give them some structure of motivation. They have to know what will happen to them if they succeed and how much they will suffer if they fail. If you don't make this mechanism explicit, you will deal with an implicit version of it where your chances of winning are very low. Most people assume the best and the only motivation for a software team is to stay on the project. You're paying them and that's enough, right? Wrong. Management can't be effective when a monthly bank transfer is a reward and its absence is a punishment. It is too coarse-grained; that's why it's absolutely ineffective. Find a better and more fine-grained mechanism. This post may help: Why Monetary Awards Don't Work.


Wikipedia's Definition of a Software Bug Is Wrong


Here is what Wikipedia says at the time of this writing:

A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result or to behave in unintended ways.

I think that's incomplete. The definition entirely excludes "non-behavioral" defects related to, for example, maintainability and reusability.

As you know, every piece of software has functional and non-functional requirements. Functional requirements tell us what the software has to do, and non-functional requirements document how it has to do it. For example, here is a functional requirement:

The user can generate a PDF report.

If our software doesn't generate a PDF report and crashes instead, that's a functional bug. If instead of a PDF report, it generates an empty page or a plain text document, that's a functional bug. If there is no "generate PDF report" button at all and the user simply can't start the PDF generation process, that's a functional bug.

Here is an example of a non-functional requirement:

PDF report generation must take less than 100ms.

If our software generates a perfectly correct PDF report but it takes a minute, that's a non-functional bug.
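A non-functional requirement like this one can itself be enforced by a test. Here is a minimal sketch, with an invented stand-in for the real generator (all names are hypothetical, and in a real project you would measure the actual generation code):

```java
// Hypothetical example: treat the 100ms budget as a requirement
// that a test can enforce, so a slow generator fails the build.
public final class Main {
  // A stand-in for the real PDF generator (names are invented)
  private static byte[] generate() {
    return new byte[] {'%', 'P', 'D', 'F', '-'};
  }
  public static void main(final String[] args) {
    final long start = System.nanoTime();
    generate();
    final long millis = (System.nanoTime() - start) / 1_000_000L;
    if (millis >= 100L) {
      throw new AssertionError("non-functional bug: took " + millis + "ms");
    }
    System.out.println("fast enough");
  }
}
```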

So far so good, since the bug definition given by Wikipedia perfectly covers both of them---if they happen, they will cause our software "to produce an incorrect or unexpected result or to behave in an unintended way." The emphasis here is on the words "produce" and "behave." They presume the software is doing something and we're observing its behavior.

However, that's not all of it.

What about maintainability? I may have this kind of non-functional requirement:

The source code of the PDF generator must be
easy to maintain and extend for an average
Java programmer.

It's a rather vague requirement, but you get the idea.

Maintainability and reusability are very critical non-functional components of any modern software program, especially taking into account a very high cost of labor in the market. Very often, it's more important to make sure the software is maintainable than fast. If it's maintainable and slow, we can find new programmers to improve the code. If it is fast but unmaintainable, we won't be able to do anything with it later and will have to rewrite it from scratch if some new feature is required. Read more about this in Are You a Hacker or a Designer?.

The definition of a software bug given by Wikipedia doesn't cover maintainability and reusability flaws at all. That makes for a common source of confusion---an inconsistent code style is not a bug (see the discussion under this post).

That is wrong.

An inconsistent code style is a software bug, as is incomplete documentation, lack of documentation, code that's too complex, the lack of a coding style guide, etc.

I would rewrite the software bug definition paragraph in Wikipedia like this:

A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to violate at least one of its functional or non-functional requirements.

This definition looks more accurate to me.


Seven Deadly Sins of a Software Project


Maintainability is the most valuable virtue of modern software development. Maintainability can basically be measured as the working time required for a new developer to learn the software before he or she can start making serious changes in it. The longer the time, the lower the maintainability. In some projects, this time requirement is close to infinity, which means it is literally unmaintainable. I believe there are seven fundamental and fatal sins that make our software unmaintainable. Here they are.

Anti-Patterns


Unfortunately, the programming languages we're using are too flexible. They allow too much and forbid too little. For example, Java has nothing against you placing the entire application in one single "class" with a few thousand methods. Technically, the application will compile and run. But it's a well-known anti-pattern called a God object.

Thus, an anti-pattern is a technically acceptable way of designing things that is commonly agreed to be wrong. There are many anti-patterns in each language. Their presence in your product is similar to a tumor in a living organism. Once it starts to grow, it's very difficult to stop. Eventually, the entire body dies. Eventually, the entire software becomes unmaintainable and has to be re-written.

Once you let a few anti-patterns in, you will eventually get more of them, and the "tumor" will only grow.

This is especially true for object-oriented languages (Java, C++, Ruby, and Python), mostly because they inherit so much from procedural languages (C, Fortran, and COBOL). And because OOP developers tend to think in a procedural and imperative way. Unfortunately.

By the way, in addition to an existing list of well-known anti-patterns, I also consider these few things as rather bad coding approaches.

My only practical suggestion here is to read and learn. Maybe these books will help you, or my book "Elegant Objects." Always try to be skeptical about the quality of your software, and don't relax when it "just works." As with cancer, the earlier you diagnose it, the greater the chance of survival.

Untraceable Changes


When I look at the commit history, I should be able to tell for every single change what was changed, who made a change, and why the change was made. Moreover, the time required to get those three answers must be measured in seconds. In most projects, this is not the case. Here are a few practical recommendations:

Always Use Tickets. No matter how small the project or its team is, even if it's just yourself, create tickets (GitHub issues) for every problem you're solving. Explain the problem briefly in the ticket and document your thinking there. Use the ticket as temporary storage for all information related to the problem. Post everything that could make any sense in the future, when someone else will try to understand what those "few strange commits" were about.

Reference Tickets in Commits. Needless to say, every commit must have a message. Commits without messages are a very dirty practice; I won't even discuss why. But just a message is not enough. Every message must start with the number of the ticket you're working with. GitHub (I'm sure you are using it) will automatically link commits and tickets, increasing traceability of changes.

Don't Delete Anything. Git allows us to do a "forced" push, which overwrites the entire branch that previously existed on the server. This is just one example of how you can destroy the history of development. Many times I've also seen people delete their comments in GitHub discussions to make tickets look more "clean." That's just wrong. Never delete anything; let your history stay with you, no matter how bad (or messy) it may look to you now.

Ad Hoc Releases


Every piece of software must be packaged before it can be delivered to the end user. If it's a Java library, it has to be packaged as a .jar file and released to some repository; if it's a web app, it has to be deployed to some platform, etc. No matter how small or big the product is, there is always a standard procedure that tests, packages, and deploys.

An ideal solution would be to automate this procedure so it is possible to execute it from a command line with a single command:

$ ./release.sh
...
DONE (took 98.7s)

Most projects are far from that. Their release process always involves some magic, where the person responsible for it (also known as a DevOp) has to click some buttons here and there, login somewhere, check some metrics, etc. Such an ad hoc release process is still a very typical sin of the entire software engineering industry.

I can give only one practical piece of advice here: Automate it. I use rultor.com for that, but you can use whatever tools you like. What is important is that the entire procedure is fully automated and can be executed from the command line.

Volunteer Static Analysis


Static analysis is what makes our code look better. And by making it look better, we are inevitably making it work better. But this happens only when the entire team is forced (!) to follow the rules dictated by the static analyzer(s). I've written about this in Strict Control of Java Code Quality. I use qulice.com in Java projects and rubocop in Ruby, but there are many similar tools for nearly every language.

You can use any of them, but make it mandatory! In most projects where static analysis is used, developers just build nicely-looking reports and continue to write code the way they did before. Such a "volunteer" approach is not doing any favors for the project. Moreover, it creates an illusion of quality.

What I'm saying is that static analysis must be a mandatory step in your deployment pipeline. The build can't pass if any static analysis rule is violated.
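For a Maven project, one way this could look is the qulice-maven-plugin mentioned above bound to the build, so a violation fails it. This is only a sketch; the exact configuration and version depend on your project:

```xml
<plugin>
  <groupId>com.qulice</groupId>
  <artifactId>qulice-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```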

Unknown Test Coverage


Simply put, test coverage is the degree to which the software has been tested by unit or integration tests. The higher the coverage, the greater the "amount" of code that was executed while the tests were running. Obviously, higher coverage is a good thing.

However, many project developers simply don't know their coverage. They just don't measure this metric. They may have some tests, but nobody knows how deeply they penetrate the software and what parts of it are not tested at all. This situation is much worse than low test coverage that is measured and reported to everyone.

High coverage is not a guarantee of high quality. That's obvious. But unknown coverage is a clear indicator of maintainability problems. When a new developer enters the project, he or she should be able to make some changes and see how coverage is affected by them. Ideally, test coverage should be checked the same way as static analysis, and the build should fail if it comes out lower than a certain pre-defined threshold (usually somewhere around 80 percent).
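In a Maven project, such a threshold could be enforced with the JaCoCo plugin, for example. A sketch (the version is omitted, and the 0.80 minimum corresponds to the 80 percent threshold mentioned above):

```xml
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
    </execution>
    <execution>
      <id>check</id>
      <goals>
        <goal>check</goal>
      </goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>LINE</counter>
                <value>COVEREDRATIO</value>
                <minimum>0.80</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this in place, the build fails as soon as line coverage drops below the threshold, and a new developer sees immediately how a change affects it.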

Nonstop Development


What I mean by nonstop is without milestones and releases. No matter what kind of software you're writing, you must release and versionalize it frequently. A project without a clear release history is an unmaintainable mess.

This is mostly because maintainability is all about me being able to understand you by reading your code.

When I look into the source code and its commit and release history, I have to be able to tell what the intention of its author(s) was, what the project was doing a year ago, where it is going now, what its roadmap is, etc. All this information must be in the source code and, more importantly, in Git history.

Git tags and GitHub release notes are two powerful instruments that provide me such information. Use them to their full extent. Also, don't forget that each binary version of the product must be available for immediate download. I have to be able to download version 0.1.3 and test it right now, even if the project is working on 3.4 at the moment.

Undocumented Interfaces


Every piece of software has interfaces through which it is supposed to be used. If it's a Ruby gem, there are classes and methods that I'm going to use as an end user of it. If it's a web app, there are web pages that an end user will see and control in order to use the app. Every software project has interfaces, and they must be carefully documented.

Like everything above, this is also about maintainability. As a new programmer on a project, I will start to learn about it from its interfaces. I should be able to understand what it does and try to use it myself.

I'm talking here about documentation for users, not for developers. In general, I'm against documentation inside the software. Here I totally agree with the Agile Manifesto---working software is much more important than comprehensive documentation. But that's not referring to "external" documentation, which is supposed to be read by users, not developers.

So end-user interaction with the software must be clearly documented.

If your software is a library, then its end users are software developers who are going to use it---not contribute to it but simply use it as a "black box."


These are the criteria being used to evaluate the open source projects entered in our award competition.


How Much For This Software?


"Here is the specification; how much will it cost to create this software?" I hear this almost every day from clients of Zerocracy and prospects that are planning to become our clients and outsource the software development to us. My best answer is "I don't know; it depends." Sounds like a strange response for someone who claims he knows what he is doing, doesn't it? "Here is the 20-page specification that explains all the features of the product; how come you can't estimate the cost?" I can, but I won't. Here is why.

Interstate 60: Episodes of the Road (2002) by Bob Gale

Let me ask you something else: Why do you need an estimate? Yes, I mean it---why do you ask me how much it will cost to develop the software for you? I can tell you why.

Because you don't trust me.

And obviously you have good reasons for that, simply because we both know that a software product is something that can stay in development forever and never be finished. Look at YouTube, for example. How much do you think it would take to create a website like this, where users are able to upload their videos and then stream them? A few days for a good web developer. Will it stream video? Yes, it will. Will it be ready to compete with YouTube? No, it won't. Add a few hundred developers to the team, a few years, and a few million dollars, and even then you will be behind YouTube. Simply because it's a never-ending process. Any software, no matter how big or good it is, always needs more and more improvements and bug fixes.

Thus, when you ask me how much it will cost to create a system similar to YouTube, according to your specifications, my honest and accurate answer should be: "All of your money, and it won't be enough." Will you sign a contract and outsource the project to me after that answer? No, you won't. That's why I have to lie and promise something like "three months and $40,000." Why would you trust me? If you're smart enough, you won't.

My point is that no matter what I promise you, I will be wrong. Terribly wrong.

What is the solution? What do you do? I totally understand that you do need a number to make your plans and secure the money. You need to choose the right software outsourcing partner, and you need to know what to expect and when, but ...

You're asking the wrong question!

This question has only one valid answer, and it's the answer nobody will ever give you---the development will take forever and will consume all your money. All other answers are simply a lie.

I'm sorry if I've delivered bad news to you.

But let's get back to the original problem: Why are you asking me how much it will take to develop the software if you know it's a never-ending process and there is basically no limit? Because you want to make sure your $40,000 will be used the right way and will produce the maximum it can produce. To get this assurance from me, you're asking for an estimate. I'm telling you that your software will be ready for $40K, and you sleep well. Until the moment you realize you've been fooled. By your own self.

Your concern is perfectly valid. You want to spend no more than $40K, and you want to get a product that will help you achieve your business goals. For example, you want to get into the market and acquire your first few thousand users. In other words, your biggest worry is that your dollars will be turned into the right amount of the right software.

Any software team can consume your $40K, but each team will produce a different amount of software with different quality.

My point is that instead of asking how much a software project will cost, you should ask how much software we can create for each dollar you give us and what quality it will be.

Basically, don't ask us to estimate how much gas it will take to get to the finish line, because there is no finish line. Instead, ask us how much we charge per gallon and how many gallons we consume per mile.

Different teams use different metrics to measure their results (to be honest, most of them don't use any). We, at Zerocracy, use hits of code, bugs, pull requests, test coverage, and a few other metrics as measurable indicators of quantity and quality. We know exactly how much software we can produce for each $100 you pay us.

Collect those numbers from other teams and compare them. Also, make sure you can control these numbers during the course of the project. That's the guarantee you're looking for. Now you know what you're buying and how much you're paying for it. In other words, like I said above, having these numbers in front of you will assure you that your money is producing the maximum amount of software it can produce, at the highest quality.

The only question left is how you can know you're buying the right software. In other words, you know how much we charge per gallon and how many gallons we use per mile, but how do you know we're driving in the right direction and not making too many circles or detours?

There is only one mechanism to guarantee that: regular checkpoints. You should ask us whether we deliver the software in small and regular increments, and whether we allow you to conduct regular independent reviews of our progress. Also, make sure we prioritize our technical goals, use milestones, versionalize releases, publish release notes, etc. Make sure that in the course of our journey, you are able to control the progress and make corrections.

To summarize, this is wrong (because there is no "there"):

Hey driver, how much will it cost to get there?

And this is right:

Hey driver, how much do you charge per mile, and do you have a map?

Hope I've made my point clear :)


There Can Be Only One Primary Constructor


I suggest classifying class constructors in OOP as primary and secondary. A primary constructor is the one that constructs an object and encapsulates other objects inside it. A secondary one is simply a preparation step before calling a primary constructor and is not really a constructor but rather an introductory layer in front of a real constructing mechanism.

The Matrix (1999) by The Wachowski Brothers

Here is what I mean:

final class Cash {
  private final int cents;
  private final String currency;
  public Cash() { // secondary
    this(0);
  }
  public Cash(int cts) { // secondary
    this(cts, "USD");
  }
  public Cash(int cts, String crn) { // primary
    this.cents = cts;
    this.currency = crn;
  }
  // methods here
}

There are three constructors in the class---only one is primary and the other two are secondary. My definition of a secondary constructor is simple: It doesn't do anything besides calling a primary constructor, through this(..).

My point here is that a properly designed class must have only one primary constructor, and it should be declared after all secondary ones. Why? There is only one reason behind this rule: It helps eliminate code duplication.

Without such a rule, we may have this design for our class:

final class Cash {
  private final int cents;
  private final String currency;
  public Cash() { // primary
    this.cents = 0;
    this.currency = "USD";
  }
  public Cash(int cts) { // primary
    this.cents = cts;
    this.currency = "USD";
  }
  public Cash(int cts, String crn) { // primary
    this.cents = cts;
    this.currency = crn;
  }
  // methods here
}

There's not a lot of code here, but the duplication is massive and ugly; I hope you see it for yourself.

By strictly following this suggested rule, all classes will have a single entry point (point of construction), which is a primary constructor, and it will always be easy to find because it stays below all secondary constructors.
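To make the rule concrete, here is a minimal runnable sketch of the Cash class above, with both secondary constructors funneling into the primary one (the toString() method is an addition purely for demonstration and is not part of the class discussed above):

```java
// All secondary constructors delegate to the single primary one,
// so the fields are assigned in exactly one place.
final class Cash {
  private final int cents;
  private final String currency;
  public Cash() { // secondary
    this(0);
  }
  public Cash(int cts) { // secondary
    this(cts, "USD");
  }
  public Cash(int cts, String crn) { // primary
    this.cents = cts;
    this.currency = crn;
  }
  @Override
  public String toString() { // added for demonstration only
    return this.cents + " " + this.currency;
  }
}

public class Main {
  public static void main(String[] args) {
    // Every path of construction ends up in the primary constructor:
    System.out.println(new Cash());            // 0 USD
    System.out.println(new Cash(250));         // 250 USD
    System.out.println(new Cash(250, "EUR"));  // 250 EUR
  }
}
```

No matter which constructor a client calls, the object is always assembled by the same single piece of code.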

More about this subject in Elegant Objects, Section 1.2.


A Few Thoughts on Unit Test Scaffolding


When I start to repeat myself in unit test methods by creating the same objects and preparing the data to run the test, I feel disappointed in my design. Long test methods with a lot of code duplication just don't look right. To simplify and shorten them, there are basically two options, at least in Java: 1) private properties initialized through @Before and @BeforeClass, and 2) private static methods. They both look anti-OOP to me, and I think there is an alternative. Let me explain.

Léon: The Professional by Luc Besson

JUnit officially suggests a test fixture:

public final class MetricsTest {
  private File temp;
  private Folder folder;
  @Before
  public void prepare() throws IOException {
    this.temp = Files.createTempDirectory("test").toFile();
    this.folder = new DiscFolder(this.temp);
    this.folder.save("first.txt", "Hello, world!");
    this.folder.save("second.txt", "Goodbye!");
  }
  @After
  public void clean() throws IOException {
    FileUtils.deleteDirectory(this.temp);
  }
  @Test
  public void calculatesTotalSize() {
    assertEquals(22, new Metrics(this.folder).size());
  }
  @Test
  public void countsWordsInFiles() {
    assertEquals(4, new Metrics(this.folder).wc());
  }
}

I think it's obvious what this test is doing. First, in prepare(), it creates a "test fixture" of type Folder. That fixture is used in both tests as an argument for the Metrics constructor. The real class being tested here is Metrics, while this.folder is something we need in order to test it.

What's wrong with this test? There is one serious issue: coupling between test methods. Test methods (and all tests in general) must be perfectly isolated from each other. This means that changing one test must not affect any others. In this example, that is not the case. When I want to change the countsWordsInFiles() test, I have to change the internals of prepare(), which will affect the other method in the test "class."

With all due respect to JUnit, the idea of creating test fixtures in @Before and @After is wrong, mostly because it encourages developers to couple test methods.

Here is how we can improve our test and isolate test methods:

public final class MetricsTest {
  @Test
  public void calculatesTotalSize() throws IOException {
    final File dir = Files.createTempDirectory("test-1").toFile();
    final Folder folder = MetricsTest.folder(
      dir,
      "first.txt:Hello, world!",
      "second.txt:Goodbye!"
    );
    try {
      assertEquals(22, new Metrics(folder).size());
    } finally {
      FileUtils.deleteDirectory(dir);
    }
  }
  @Test
  public void countsWordsInFiles() throws IOException {
    final File dir = Files.createTempDirectory("test-2").toFile();
    final Folder folder = MetricsTest.folder(
      dir,
      "alpha.txt:Three words here",
      "beta.txt:two words",
      "gamma.txt:one!"
    );
    try {
      assertEquals(6, new Metrics(folder).wc());
    } finally {
      FileUtils.deleteDirectory(dir);
    }
  }
  private static Folder folder(File dir, String... parts) {
    final Folder folder = new DiscFolder(dir);
    for (final String part : parts) {
      final String[] pair = part.split(":", 2);
      folder.save(pair[0], pair[1]);
    }
    return folder;
  }
}

Does it look better now? We're not there yet, but now our test methods are perfectly isolated. If I want to change one of them, I'm not going to affect the others because I pass all configuration parameters to a private static utility (!) method folder().

A utility method, huh? Yes, it smells.

The main issue with this design, even though it is way better than the previous one, is that it doesn't prevent code duplication between test "classes." If I need a similar test fixture of type Folder in another test case, I will have to move this static method there. Or even worse, I will have to create a utility class. Yes, there is nothing worse in object-oriented programming than utility classes.

A much better design would be to use "fake" objects instead of private static utilities. Here is how. First, we create a fake class and place it into src/main/java. This class can be used in tests and also in production code, if necessary (Fk for "fake"):

public final class FkFolder implements Folder, Closeable {
  private final File dir;
  private final String[] parts;
  public FkFolder(String... prts) throws IOException {
    this(Files.createTempDirectory("test-1").toFile(), prts);
  }
  public FkFolder(File file, String... prts) {
    this.dir = file;
    this.parts = prts;
  }
  @Override
  public Iterable<File> files() {
    final Folder folder = new DiscFolder(this.dir);
    for (final String part : this.parts) {
      final String[] pair = part.split(":", 2);
      folder.save(pair[0], pair[1]);
    }
    return folder.files();
  }
  @Override
  public void close() throws IOException {
    FileUtils.deleteDirectory(this.dir);
  }
}

Here is how our test will look now:

public final class MetricsTest {
  @Test
  public void calculatesTotalSize() throws IOException {
    final String[] parts = {
      "first.txt:Hello, world!",
      "second.txt:Goodbye!"
    };
    try (final FkFolder folder = new FkFolder(parts)) {
      assertEquals(22, new Metrics(folder).size());
    }
  }
  @Test
  public void countsWordsInFiles() throws IOException {
    final String[] parts = {
      "alpha.txt:Three words here",
      "beta.txt:two words",
      "gamma.txt:one!"
    };
    try (final FkFolder folder = new FkFolder(parts)) {
      assertEquals(6, new Metrics(folder).wc());
    }
  }
}

What do you think? Isn't it better than what JUnit offers? Isn't it more reusable and extensible than utility methods?

To summarize, I believe scaffolding in unit testing must be done through fake objects that are shipped together with production code.


How to Avoid a Software Outsourcing Disaster


Software outsourcing is a disaster waiting to happen; we all know that. First, you find a company that promises you everything you could wish for in a product---on-time and in-budget delivery, highest quality, beautiful user interface, cutting-edge technologies, and hassle-free lifetime support. So you send the first payment and your journey starts. The team hardly understands your needs, the quality is terrible, all your time and budget expectations are severely violated, and the level of frustration is skyrocketing. And the "best" part is that you can't get away or else all the money you've spent so far will go down the drain and you will have to start from scratch. You have to stay "married" to this team because you can't afford a "divorce." Is there a way to do software outsourcing right?

The Evil Cult (1993) by Jing Wong

Yes, it is possible to do it right and truly hassle-free, but you have to be ready to twist your management philosophy.

The fundamental principle here is that 1) you should openly and frequently communicate your concerns to the outsourcing team, and 2) they should openly and frequently communicate risks and issues to you. These are the two major success factors in software outsourcing, and they are very often neglected.


I learned this principle from Wei Liao Zi. He said, according to Military Strategy Classics of Ancient China, p.239:

When information from below reaches up high, and the concerns of up high penetrate to below, this is the most ideal situation.

Let me walk through a few practical examples of software outsourcing disasters and explain how they can be avoided if you follow this 2,500-year-old principle.

It Takes Forever and I'm Over Budget!

It's always 95 percent ready, and you always have something that is not implemented or is broken. They've done a lot of work, you've paid a lot of money, but a market-ready product is not yet there. It takes week after week and month after month; the backlog always has something, and you simply can't finish this. You're starting to see this project in your nightmares, and Prozac doesn't help anymore. How does this sound? Familiar?

I hope you do realize that no matter what kind of contract you signed with your software outsourcing partner, how many schedules you've baselined, or how many promises were made, they want to keep you as a client forever. Well, for as long as you have something in your bank account.

You want your business to succeed and flourish, right? They want the same for their business. Your success means a product that is finished and launched to end users. Their success means a never-ending process of writing software for you. These two goals have very little in common. I would even say they contradict each other---when you succeed, they fail.

Of course, they will tell you they want to finish this product for you and get new contracts in the future. They will say their primary motivation is to make you happy and obtain a good reference. They will assure you that customer satisfaction is more important than money. However, I'm suggesting you be strong enough to face the reality---it's all lies.

The majority of software outsourcing projects fail. The vast majority (see the latest CHAOS report). Software developers realize this better than you, mostly because they see how it happens every day. And your project is not an exception. Thus, let's forget about these beautiful promises and focus on the ugly reality---you're on your own.

Keeping in mind the principle I mentioned above, here is my recommendation: Make sure the team understands 1) your real time, cost, and scope constraints, and 2) the consequences of their violation. This is about the first part of the principle---you should openly and frequently communicate your concerns. What usually happens is that the outsourcing team remains unaware of a real business situation and only hears "I need this ASAP" every second day.

"ASAP" is not a deadline. Moreover, it's a very de-motivating substitute for a realistic milestone. When the team doesn't know when exactly you need the product, what exactly has to be ready by that date, and why, it starts to work against you. The emphasis here is on "why." For most business owners, it's difficult to answer this question.

Why do you need the product to be ready by the first of June? Just because you are sick of waiting? This is not a reasonable answer. You're sick of it but you still have money in your bank account. They will keep invoicing you, and they won't respect you. They won't treat you as a strong and goal-oriented business person. You either aren't smart enough to identify your time constraints or you're hiding them from the team. In either case, they won't appreciate that behavior.

Here is how a properly defined time and cost constraint may sound:

Features A, B, and D must be ready before
the first of June, because our marketing
campaign starts on the fifth of June. If
we don't have them ready, I will lose $25,000
in marketing costs. If this happens, I will
have to cut the monthly development budget
in half.

When the software outsourcing company, your partner, hears this definition of a deadline, it becomes a real partner of yours. Now its goals are aligned with yours. If the milestone is missed, you will suffer and they understand exactly how. Besides that, they see how your suffering will be transferred to their shoulders too.

Stop asking them to finish everything ASAP. Stop calling them twice a day and yelling for an hour about their poor performance. Stop using foul language in business emails. Stop making all this noise. It doesn't help you anyway. Moreover, it only makes the situation worse, because you're losing respect and they're starting to treat you like a cash cow---a rather stupid and emotional one.

Instead, do your homework and define your realistic milestones. Think about your real time, scope, and budget limitations. Write them down in very short and concise sentences. Make sure your constraints are realistic and their descriptions answer the main question---why.

Why do you need this by the first of June? Why do you want to spend less than $50,000? Why do you need all five features to be in version 1.0? Why do you want your web app to be ready to handle 1K concurrent sessions? Why do you need a mobile app in the first release?

Answer for yourself and make sure your answers are understood by the outsourcing company. Don't hide this information.

The Product Is So Clumsy

You want your web app to look like Pinterest, react fast, be easy to use, and make you proud when you show it to your friends. But the product they created for you is clumsy, slow, and to be honest, ugly. You're asking them to do something about it, and they keep giving you promises. The project keeps consuming your money and its budget grows, but the look and feel is not getting any better. It is far from Pinterest, very far. The frustration is growing, and you don't see any reasonable way out of this. The only advice you're getting from your friends is to re-do it all from scratch with a new web development team. How does this sound? I bet it's familiar.

I believe the root cause of this dead-end situation is a fear of conflict. At early stages in the project, you try to do everything you can to keep a good relationship with the outsourcing company and not to offend anyone. You don't want to control anyone's work because they may take it as an insult. You don't want to express your quality concerns because they may de-motivate the team. You just hope they will improve the product in the future, but when the future comes, it's too late.

Again, keeping the age-old principle in mind, I would recommend that from the first day of the project, you establish a routine procedure of checking their results and expressing your concerns. In our projects at Zerocracy, we ask our customers to be present in GitHub, review our releases frequently, and report any inconsistencies they find as GitHub issues. We encourage project sponsors to be as pessimistic and negative about our quality as possible from the very beginning of the project. We realize this is how we can minimize the risk of piled-up frustration.

Try to do the same in your project that is outsourced to an offshore developer. Don't be afraid to offend them. Iterative and incremental criticism is a much healthier approach than feedback-free peace that ends in war. Find a way to keep your outsourcing team aware of your opinion about its results on a regular basis. Don't try to be nice in order to save the project; you're doing yourself a disservice. Instead, be open about your concerns. Remember the first part of the principle above---you should openly and frequently communicate your concerns. This is how you stabilize the project and minimize risks.

Also, it's a very good practice, from time to time, to invite technical reviewers to generate independent opinions about the product under development. Read my other post about this subject: You Do Need Independent Technical Reviews!.

I Can't Rely on Their Promises

You call them, make plans, declare milestones, define features, set priorities, agree about quality, and then hang up. In a few days, you realize it was a total waste of time. They don't keep their promises because there is always something new happening. Someone is sick, some server is broken, some piece of software turns out to be malfunctioning, some code is no longer working, etc. You call again, express your frustration, make strong accusations, restructure milestones, redefine features, reset priorities, and in a few days start over. Been there, done that? Sound familiar?

In my experience, this unpredictability and unreliability of a software outsourcing team is in most cases caused by a project sponsor himself or herself. This happens when you don't listen to them or they are afraid to tell you the truth, which is usually the same thing. Some call this "fear-driven development." The team is afraid of you, and in order to keep you on board as a paying customer, has to lie to you.

Basically, they are telling you what you want to hear---that the end of the project is close, that currently open bugs are easy to fix, that performance problems are minor, that the quality of the architecture is outstanding, and that the team is very motivated to work with you. When you hear any of the above, question yourself---Do you encourage them to tell the truth? Do you reward them for bringing you bad but honest news?

Once again citing the fundamental principle mentioned above, I would recommend you make sure your reasoning for awards and punishments is transparent to your software outsourcing partner and is based on project objectives, not your personal emotions.

In one of my previous posts, I wrote that a happy customer is a false objective for a software development team. A customer who is promoting this objective is a terrible customer who is doomed to fail the project. If you reward your team when they make you happy with good news, you are training them to lie to you. If you expect them to deliver good news, you are discouraging them from telling you the truth and from doing what is good for the project, not for you personally.

You're discouraging them from arguing with you. In other words, you're throttling the channel of information that is supposed to come to you from the people working for you. You're isolating yourself, and the team is starting to work against you, not with you.

Here is a practical recommendation. First, regularly announce your reasonable objectives and constraints, like I explained above. Make sure the team understands your business plans and the "why" reasoning behind them. Second, regularly ask team members about risks and issues. Ask them why they think project objectives may be compromised. Even better, let them document risks regularly and report them back to you. Reward them for being honest in this list of risks.

Try it and you will be surprised by how many interesting things that risk list will contain.


How Cookie-Based Authentication Works in the Takes Framework



When you enter your email and password into the Facebook login page, you get into your account. Then, wherever you go on the site, you always see your photo in the top-right corner of the page. Facebook remembers you and doesn't ask for the password again and again. This works thanks to HTTP cookies and is called cookie-based authentication. Even though this mechanism often causes security problems, it is very popular and simple. Here is how Takes makes it possible in a few lines of code.

First, let's see how it works---or rather, how I believe it should work.

Step one: The user enters an email and password and clicks "submit." The server receives a POST request with this information inside:

POST / HTTP/1.1
Host: www.facebook.com
Content-Type: application/x-www-form-urlencoded

email=me@yegor256.com&password=itisasecret

The server matches the provided information with its records and decides what to do. If the information is invalid, it returns the same login page, asking you to enter it all again. If the information is valid, the server returns something like this:

HTTP/1.1 303 See Other
Location: www.facebook.com
Set-Cookie: user=me@yegor256.com

Since the response status code is 303, the browser goes to the page specified in the Location header and opens the front page of the site. This is what it sends to the server:

GET / HTTP/1.1
Host: www.facebook.com
Cookie: user=me@yegor256.com

The server gets my email from the Cookie header and understands that it's me again! No need to ask for the password once more. The server trusts the information from the cookie. That's it. That's what cookie-based authentication is all about.
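That server-side step can be sketched in a few lines of plain Java (this is an illustration, not Takes code; the CookieAuth class and its user() helper are hypothetical names introduced just for this example):

```java
// Hypothetical sketch: pull the "user" value out of a raw Cookie header,
// the way a naive cookie-authenticating server would.
public class CookieAuth {
  // "user=me@yegor256.com; theme=dark" -> "me@yegor256.com"
  static String user(String cookieHeader) {
    for (final String pair : cookieHeader.split(";\\s*")) {
      final String[] parts = pair.split("=", 2);
      if (parts.length == 2 && parts[0].equals("user")) {
        return parts[1];
      }
    }
    return null; // no "user" cookie: the visitor is anonymous
  }
  public static void main(String[] args) {
    System.out.println(user("user=me@yegor256.com")); // me@yegor256.com
  }
}
```

The server simply believes whatever the header says, which is exactly why the security measures below are necessary.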

Wait ... What About Security?

Right, what about security? If the server trusts any browser request with a user email in the Cookie header, anyone would be able to send my email from another place and get access to my account.

The first step to prevent this is to encrypt the email with a secret encryption key, known only to the server. Nobody except the server itself will be able to encrypt it the way the server expects to decrypt it. Here is how the response would look, using as an example a XOR cipher with bamboo as the secret key:

HTTP/1.1 303 See Other
Location: www.facebook.com
Set-Cookie: user=b1ccafd92c568515100f5c4d104671003cfa39

This is not a serious encryption mechanism, though; for proper protection, it's better to use a modern cipher such as AES.
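For illustration only, here is a toy sketch of that XOR-with-a-secret-key idea (the XorCookie class and its methods are hypothetical names; a real server should use a proper cipher, as noted above):

```java
import java.nio.charset.StandardCharsets;

// Toy XOR "encryption" of a cookie value, rendered as hex.
// Hypothetical illustration only; do not use XOR for real security.
public class XorCookie {
  // XOR each byte of the text with the repeating secret key.
  static String encrypt(String text, String key) {
    final byte[] data = text.getBytes(StandardCharsets.UTF_8);
    final byte[] secret = key.getBytes(StandardCharsets.UTF_8);
    final StringBuilder hex = new StringBuilder();
    for (int i = 0; i < data.length; ++i) {
      hex.append(
        String.format("%02x", (data[i] ^ secret[i % secret.length]) & 0xff)
      );
    }
    return hex.toString();
  }
  // XOR is symmetric: applying the same key again restores the text.
  static String decrypt(String hex, String key) {
    final byte[] secret = key.getBytes(StandardCharsets.UTF_8);
    final byte[] data = new byte[hex.length() / 2];
    for (int i = 0; i < data.length; ++i) {
      data[i] = (byte) (
        Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16)
          ^ secret[i % secret.length]
      );
    }
    return new String(data, StandardCharsets.UTF_8);
  }
  public static void main(String[] args) {
    final String cookie = encrypt("me@yegor256.com", "bamboo");
    System.out.println(decrypt(cookie, "bamboo")); // me@yegor256.com
  }
}
```

Only the holder of the key can produce a hex string that decrypts back to a valid email, which is the whole point of signing the cookie.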

This all sounds good, but what if someone hijacks the traffic between the server and the browser and gets a hold of a properly encrypted email cookie? In this case, the thief would be able to use the same cookie for authentication even without knowing its content. The server would trust the information and let the person into my account. This type of attack is called man-in-the-middle (MITM). To prevent this from happening, we should use HTTPS and inform the browser that the cookie is sensitive and should never be returned to the server without SSL encryption. That's done by an extra flag in the Set-Cookie header:

HTTP/1.1 303 See Other
Location: www.facebook.com
Set-Cookie: user=me@yegor256.com; Secure

There is yet another type of attack associated with cookie-based authentication, based on a browser's ability to expose all cookies associated with a web page to JavaScript executed inside it. An attacker may inject some malicious JavaScript code into the page (Don't ask me how ... this will happen only if your entire HTML rendering is done wrong), and this code will gain access to the cookie. Then, the code will send the cookie somewhere else so the attacker can collect it. This type of attack is called cross-site scripting (XSS). To prevent this, there is another flag for the Set-Cookie header, called HttpOnly:

HTTP/1.1 303 See Other
Location: www.facebook.com
Set-Cookie: user=me@yegor256.com; Secure; HttpOnly

The presence of this flag will tell the browser that this particular cookie can be transferred back to the server only through HTTP requests. JavaScript won't have access to it.

How It's Done in Takes

Here is how this cookie-based authentication mechanism is designed in the Takes framework. The entire framework consists of takes, which receive requests and produce responses (this article explains the framework in more detail). When the request comes in, we should find the authentication cookie in the Cookie header and translate it to the user credentials. When the response goes out, we should add the Set-Cookie header to it with the encrypted user credentials. That's it. Just these two steps.

Let's say we have an account page that is supposed to show the current user's balance:

final class TkAccount implements Take {
  private final Balances balances;
  @Override
  public Response act(final Request request) {
    final Identity user = ...; // get it from the request
    return new RsHtml(
      String.format(
        "<html>Your balance is %s</html>",
        this.balances.retrieve(user)
      )
    );
  }
}

Right after the request comes in, we should retrieve the identity of the user, encoded inside an authenticating cookie. To make this mechanism reusable, we have the TkAuth decorator, which wraps an existing take, decodes an incoming cookie, and adds a new TkAuth header to the request with the user's identification information:

final Codec codec = new CcHex(new CcXOR(new CcPlain()));
final Pass pass = new PsCookie(codec);
new TkAuth(new TkAccount(), pass);

Again, when TkAuth receives a request with an authenticating cookie inside, it asks pass to decode the cookie and return either a valid Identity or Identity.ANONYMOUS.

Then, when the response goes back to the browser, TkAuth asks pass to encode the identity back into a string and adds Set-Cookie to the response.

PsCookie uses an instance of Codec in order to do these backward and forward encoding operations.

When our TkAccount take wants to retrieve a currently authenticated user identity from the request, it can use RqAuth, a utility decorator of Request:

final class TkAccount implements Take {
  @Override
  public Response act(final Request request) {
    final Identity user = new RqAuth(request).identity();
    // other manipulations with the user
  }
}

The RqAuth decorator uses the header, added by PsCookie, in order to authenticate the user and create an Identity object.

How Is It Composable?

This mechanism is indeed very extensible and "composable." Let's say we want to skip authentication during integration testing. Here is how:

new TkAuth(
  take, // original application "take"
  new PsChain(
    new PsFake(/* if running integration tests */),
    new PsCookie(
      new CcHex(new CcXOR(new CcPlain()))
    )
  )
);

PsChain implements Pass and attempts to authenticate the user by asking all encapsulated passes, one by one. The first one in the chain is PsFake. Using a single boolean argument in its constructor, it makes a decision whether to return a fake identity or return nothing. With just a single boolean trigger, we can switch off the entire authentication mechanism in the app.

Let's say you want to authenticate users through Facebook OAuth. Here is how:

new TkAuth(
  take, // original application "take"
  new PsChain(
    new PsByFlag(
      new PsByFlag.Pair(
        PsFacebook.class.getSimpleName(),
        new PsFacebook(
          "... Facebook API key ...",
          "... Facebook API secret ..."
        )
      )
    ),
    new PsCookie(
      new CcHex(new CcXOR(new CcPlain()))
    )
  )
);

When a user clicks on the login link on your site, the browser goes to facebook.com, where his or her identity is verified. Then, Facebook returns a 302 redirection response with a Location header set to the URL we provide in the login link. The link must include something like this: ?PsByFlag=PsFacebook. This will tell PsByFlag that this request authenticates a user.

PsByFlag will iterate through all encapsulated "pairs" and try to find the right one. PsFacebook will be the first and the right one. It will connect to the Facebook API using the provided credentials and will retrieve all possible information about the user.

Here is how we can implement a logout mechanism:

new TkAuth(
  take, // original application "take"
  new PsChain(
    new PsByFlag(
      new PsByFlag.Pair(
        PsFacebook.class.getSimpleName(),
        new PsFacebook(
          "... Facebook API key ...",
          "... Facebook API secret ..."
        )
      ),
      new PsByFlag.Pair(
        PsLogout.class.getSimpleName(),
        new PsLogout()
      )
    ),
    new PsCookie(
      new CcHex(new CcXOR(new CcPlain()))
    )
  )
);

Now, we can add ?PsByFlag=PsLogout to any link on the site and it will log the current user out.

You can see how all this works in a real application by checking out the TkAppAuth class in Rultor.


Two Instruments of a Software Architect


A software architect is a key person in any software project, no matter how big or small it is. An architect is personally responsible for the technical outcome of the entire team. A good architect knows what needs to be done and how it's going to be done, both architecturally and design-wise. In order to enforce this idea in practice, an architect uses two instruments: bugs and reviews.

Rear Window (1954) by Alfred Hitchcock

At Zerocracy, we discourage any communication between developers unless it is formally attached to the tickets or tasks we're working on. Read more details about this approach in this post.

The same principle applies to an architect. We don't use meetings, stand-ups, Skype calls, IRC channels, or any other tools where information flies in the air and stays in our heads. Instead, we put everything in writing and talk only when we're being explicitly asked to and paid to---in tickets.

Bugs

With this in mind, a reasonable question may be asked: How can a software architect enforce his or her technical vision for the team if he can't communicate with the team? Here is our answer: the architect must use bugs.

A bug is a ticket that has a reporter, a problem, and a resolver, just like this post explains. Say an architect reviews an existing technical solution and finds something that contradicts his vision. When such a contradiction is found, it is a good candidate for a bug. Sometimes there is just not enough information in the code yet, and this is also a good candidate for a bug.

Thus, bugs reported by an architect serve as communication channels between him and the team. An architect doesn't explain what needs to be done but asks the team to fix the product in a way he thinks is right. If the ticket resolver, a member of the team, disagrees with that approach, a discussion starts right in the ticket.

Sometimes an architect has doubts and needs to discuss a few possible solutions with the team or simply collect opinions. Again, we use bugs for that. But these bugs don't report problems in the source code; instead, they complain about incomplete documentation. For example, say an architect doesn't know which database to use, MongoDB or Cassandra, and needs more information about their pros and cons. A bug will sound like "our design documentation doesn't have a detailed comparison of existing NoSQL databases; please fix it." Whoever is assigned to this ticket will perform the comparison and update the documentation.

Bugs are a proactive tool for an architect. Through reporting bugs, an architect influences the project and "dictates his will."

Reviews

In our projects, every ticket is implemented in its own branch. When implementation is done, every ticket passes mandatory peer code review. In other words, developers review each other's code. An architect is not involved in this process.

But when peer review is done, each ticket goes to an architect and he has to give a final "OK" before the code goes to the master branch through Rultor, our merge bot.

This is an architect's opportunity for control. This is where he can prevent his vision from being destroyed. When the code created by a developer violates project design principles or any part of the entire technical idea, the architect says "No" and the branch is rejected.

Reviews are a reactive instrument for an architect. Through strict and non-compromising code reviews, an architect enforces his design and architectural principles.

P.S. Here is how an architect is supposed to report to the project manager: Three Things I Expect From a Software Architect.


Three Things I Expect From a Software Architect


A software architect is a key person in a software project, which I explained in my What Does a Software Architect Do? post a few months ago. The architect is personally responsible for the technical quality of the product we're developing. No matter how good the team is, how complex the technology is, how messy the requirements are, or how chaotic the project sponsor is, we blame the architect and no one else. Of course, we also reward the architect if we succeed. Here is what I, as project manager, expect from a good architect.

Dr. Strangelove (1964) by Stanley Kubrick

In all projects we run at Zerocracy, I expect regular reports from software architects a few times a week. Each report includes three mandatory parts: 1) scope status, 2) issues, and 3) risks.

Scope Status

The first and most important type of information I'm looking for is the scope status, which should be presented in Product Breakdown Structure (PBS) format. No matter how complex or how small the product is, a good architect should be able to create a PBS of four to eight items. For example:

1. MySQL persistence [done]
2. OAuth login [done]
3. Input parsing in XML [75%]
4. S3 data storage [none]
5. UI cross-platform testing [none]

That's the size of the report I'm expecting to receive from a good architect every few days. The main goal for the architect here is to make sure that nothing is missed. No matter how big the project is, all its technical components must fit into this PBS.

The architect is personally responsible for not missing the information in the PBS and making it as accurate as possible. If something is missed or the report is delayed, that becomes a good reason to change the architect.

The percentages of progress are also important here. Even though individual tasks are managed with the "0/100 completion principle" in mind, the architect must compile those percentages and make sure that compilation is accurate. Again, a mistake here is unforgivable.
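
That compilation can be sketched in a few lines of Java; the Progress class below is my own hypothetical illustration, not anything from the post. Under the "0/100 completion principle," each task counts as either fully done or not done at all, so a PBS item's percentage is simply the share of its completed tasks:

```java
import java.util.List;

// Hypothetical helper: compiles a PBS item's progress percentage from
// tasks that, under the "0/100 completion principle", are each either
// done (true) or not done (false).
final class Progress {
  private final List<Boolean> tasks;
  Progress(final List<Boolean> tasks) {
    this.tasks = tasks;
  }
  int percent() {
    if (this.tasks.isEmpty()) {
      return 0;
    }
    final long done = this.tasks.stream().filter(t -> t).count();
    return (int) (100L * done / this.tasks.size());
  }
}
```

With three of four tasks closed, such an item would report 75%, matching the style of the PBS above.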

Issues

The second important part of a regular report from an architect is a list of current issues the development team is facing. An issue is something that has already happened and we're suffering from it. Here are a few practical examples:

1. MySQL is too slow for our performance requirements
2. Java 1.6 doesn't allow us to use library X
3. We don't have a replacement for a Ruby guy who left us
4. Integration tests are not predictable

Again, the list must include four to eight items (no more and no less), and the architect should mention the most critical issues there.

Risks

Now, the risks. A risk is something that hasn't happened yet but may happen soon, and if it happens, we'll be in trouble. The architect is responsible for keeping an eye on all potential risks and regularly reporting the most critical ones to the project manager. Here is an example of a brief risk report:

1. Deployment platform may not support Java 8 [3/8]
2. Library X may take more than the two weeks planned [7/3]
3. We may lose a good Ruby developer soon [5/6]
4. Integration tests may not be safe enough [7/2]
5. We may fail to find an open source library [3/8]

A project manager may require additional information about each risk, but that's another story. What is most important is to keep the project manager informed about the top of the list. Each risk has two numbers associated with it: probability and impact, from 0 to 9. In the list above, the first risk has a probability of 3 and impact of 8. This means the architect believes that most likely this won't happen, but if it does happen, we'll be in big trouble.

Pay attention, as the key word in each risk description is may. A risk is something that hasn't happened yet. That's the biggest difference between a risk and an issue. An issue is a risk that has already occurred.
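
The probability and impact numbers also make ranking mechanical. Sorting by exposure (probability times impact) is a common convention, not something prescribed in the post, and the Risk and RiskReport names below are mine:

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: a risk scored on the 0..9 probability/impact scale.
record Risk(String text, int probability, int impact) {
  // Exposure combines the two scores into one number for ranking.
  int exposure() {
    return this.probability * this.impact;
  }
}

final class RiskReport {
  // The most critical risks (highest exposure) come first.
  static List<Risk> ranked(final List<Risk> risks) {
    return risks.stream()
      .sorted(Comparator.comparingInt(Risk::exposure).reversed())
      .toList();
  }
}
```

In the list above, risk #3 (5/6, exposure 30) would outrank risk #1 (3/8, exposure 24).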

PS. Here is how an architect can enforce the principles of design and architecture: Two Instruments of a Software Architect

Constructors Must Be Code-Free

How much work should be done within a constructor? It seems reasonable to do some computations inside a constructor and then encapsulate results. That way, when the results are required by object methods, we'll have them ready. Sounds like a good approach? No, it's not. It's a bad idea for one reason: It prevents composition of objects and makes them un-extensible.

Kill Bill: Vol. 2 (2004) by Quentin Tarantino

Let's say we're making an interface that would represent a name of a person:

interface Name {
  String first();
}

Pretty easy, right? Now, let's try to implement it:

public final class EnglishName implements Name {
  private final String name;
  public EnglishName(final CharSequence text) {
    this.name = text.toString().split(" ", 2)[0];
  }
  @Override
  public String first() {
    return this.name;
  }
}

What's wrong with this? It's faster, right? It splits the name into parts only once and encapsulates them. Then, no matter how many times we call the first() method, it will return the same value and won't need to do the splitting again. However, this is flawed thinking! Let me show you the right way and explain:

public final class EnglishName implements Name {
  private final CharSequence text;
  public EnglishName(final CharSequence txt) {
    this.text = txt;
  }
  @Override
  public String first() {
    return this.text.toString().split(" ", 2)[0];
  }
}

This is the right design. I can see you smiling, so let me prove my point.

Before I start proving, though, let me ask you to read this article: Composable Decorators vs. Imperative Utility Methods. It explains the difference between a static method and composable decorators. The first snippet above is very close to an imperative utility method, even though it looks like an object. The second example is a true object.

In the first example, we are abusing the new operator and turning it into a static method, which does all calculations for us right here and now. This is what imperative programming is about. In imperative programming, we do all calculations right now and return fully ready results. In declarative programming, we are instead trying to delay calculations for as long as possible.

Let's try to use our EnglishName class:

final Name name = new EnglishName(
  new NameInPostgreSQL(/*...*/)
);
if (/* something goes wrong */) {
  throw new IllegalStateException(
    String.format(
      "Hi, %s, we can't proceed with your application",
      name.first()
    )
  );
}

In the first line of this snippet, we are just making an instance of an object and labeling it name. We don't want to go to the database yet and fetch the full name from there, split it into parts, and encapsulate them inside name. We just want to create an instance of an object. Such a parsing behavior would be a side effect for us and, in this case, will slow down the application. As you see, we may only need name.first() if something goes wrong and we need to construct an exception object.

My point is that having any computations done inside a constructor is a bad practice and must be avoided because they are side effects and are not requested by the object owner.

What about performance during the re-use of name, you may ask. If we make an instance of EnglishName and then call name.first() five times, we'll end up with five calls to the String.split() method.

To solve that, we create another class, a composable decorator, which will help us solve this "re-use" problem:

public final class CachedName implements Name {
  private final Name origin;
  public CachedName(final Name name) {
    this.origin = name;
  }
  @Override
  @Cacheable(forever = true)
  public String first() {
    return this.origin.first();
  }
}

I'm using the Cacheable annotation from jcabi-aspects, but you can use any other caching tools available in Java (or other languages), like Guava Cache:

public final class CachedName implements Name {
  private final Cache<Long, String> cache =
    CacheBuilder.newBuilder().build();
  private final Name origin;
  public CachedName(final Name name) {
    this.origin = name;
  }
  @Override
  public String first() {
    try {
      return this.cache.get(
        1L,
        new Callable<String>() {
          @Override
          public String call() {
            return CachedName.this.origin.first();
          }
        }
      );
    } catch (final ExecutionException ex) {
      throw new IllegalStateException(ex);
    }
  }
}

But please don't make CachedName mutable and lazily loaded---it's an anti-pattern, which I've discussed before in Objects Should Be Immutable.
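
For contrast, this is roughly what the mutable, lazily loaded version would look like; LazyName is a hypothetical class invented here purely to show what to avoid:

```java
// The Name interface from the article, repeated for completeness.
interface Name {
  String first();
}

// DON'T do this: lazy loading through a mutable field.
final class LazyName implements Name {
  private final Name origin;
  private String cached; // mutable state -- the anti-pattern
  LazyName(final Name name) {
    this.origin = name;
  }
  @Override
  public String first() {
    if (this.cached == null) { // hidden side effect; also not thread-safe
      this.cached = this.origin.first();
    }
    return this.cached;
  }
}
```

A caching decorator like CachedName gives you the same performance benefit without sacrificing immutability.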

This is how our code will look now:

final Name name = new CachedName(
  new EnglishName(
    new NameInPostgreSQL(/*...*/)
  )
);

It's a very primitive example, but I hope you get the idea.

In this design, we're basically splitting the object into two parts. The first one knows how to get the first name from the English name. The second one knows how to cache the results of this calculation in memory. And now it's my decision, as a user of these classes, how exactly to use them. I will decide whether I need caching or not. This is what object composition is all about.

Let me reiterate that the only allowed statement inside a constructor is an assignment. If you need to put something else there, start thinking about refactoring---your class definitely needs a redesign.

How to Protect a Business Idea While Outsourcing

When you hire a programmer or a team of programmers to implement your business idea, there is a significant risk of theft and accidental loss. They may implement your idea (or its elements) without you, using their own resources. Also, they may disclose it to their friends, and those friends may disclose it even further, until it is eventually implemented by someone you don't even know. This happens a lot---and everywhere. I've been on both sides. Here is my experience and a few recommendations.

There Will Be Blood (2007) by Paul Thomas Anderson

There are basically a few levels of protection you can obtain, and they are listed below, from the simplest and least secure to the most expensive and most secure.

Very often, a software team you outsource programming to is located offshore in a developing country where people care about laws much less than in, say, the United States; corruption makes it possible for them to get away with ignoring almost any violation. Let's not forget this.

We're not discussing here the risk of losing a product. This also happens very often---your programmers start working for you, you pay them, they show you something, and then things go south. You find yourself in conflict with them, and they don't deliver you anything; they ask for extra money instead. You end up with nothing or something that is broken and can't be put on the market. This is yet another risk, which I'll try to describe in another article soon.

In this article, we're talking specifically about a situation where your programmers are using your idea to implement something similar on their own. Let's say you want to create a new search engine that would be smarter and more accurate than Google. You disclose your algorithms to a group of talented programmers and they just implement it on their own. You already gave them a multi-billion-dollar idea; why would they work for you on the payroll if they can create their own startup? This is the question you, as a savvy entrepreneur, must be prepared to answer for yourself.

By the way, I'm not a lawyer; I'm speaking here from practical experience only.

Non-Disclosure Agreement (NDA)

The first and easiest step is a so-called NDA. Here is a very simple and useful example of one, from NOLO. You put the name of your programmers into the document and ask them to sign it. They won't object, and you get a piece of paper with a signature; what's next?

If I'm a programmer, the document basically states that whatever you disclose to me, I should keep in secret and never "use for my own benefit, publish, copy, or otherwise disclose to others." If you then give me your Google-killer idea and I create my own product using its key principles, what will be your next steps?

In a court of law in your country, you will have to prove that: 1) you disclosed your idea to me, 2) I used it to create the product, 3) I didn't know about this idea beforehand, and 4) I didn't invent it myself. Until all of these criteria are proven, I'm innocent and my product is online, working and attracting customers.

Will you be able to prove that you disclosed the idea to me? Probably, if you sent me some documents. If you discussed it with me over the phone, you won't prove anything.

Will you be able to prove that your exact idea was used in my product? How will it technically be possible if I don't disclose the source code?

Can you prove that I didn't know about something similar before meeting you? Or maybe I was thinking about it on my own. Or I read about it somewhere else.

What if I disclose your idea to my friends and they create a product? Will you be able to prove the fact of that disclosure?

There are many such questions. My point is that a signed NDA is a very weak protection. It's more like a lock on the bathroom door---anyone can break it with a kick from their leg, but for those with good manners, it's a sign that the restroom is occupied.

Non-Compete Clause (NCC)

The next level of protection is an agreement with a Non-Compete Clause, which explicitly prohibits me from engaging in the business that your idea is about. For example, it may sound like this: "The developer is not allowed to participate in any businesses related to online searching for five years."

Will I sign this agreement? It depends. But if I show any reluctance in signing it, you should think twice about my real intentions.

Will this NCC protect you if I disclose your idea to my friends and they implement it? No, it won't.

Also, I would recommend you put some explicit liability numbers into the agreement. For example, it may sound like "a minimum penalty for a proven breach of the non-compete clause is $50,000." In my experience, such explicit statements make contracts much more valuable and help prevent them from being violated.

Copyright

Copyright is where the government starts to protect you, but you have to pay us for it (by "us" I mean all of us, the society). Well, at least in the United States. In the United Kingdom, it's free, for example.

In the U.S., you go to copyright.gov, click "register a copyright," fill in an online form, post the description of your business idea in a plain text file, pay $35, and you're done. In a few months, you will receive a confirmation that your "record" is registered.

What does it give you? In a court of law, you can claim that this idea came to your mind on that specific date. Everyone who later made something similar probably stole it from you, including me, your programmer.

Will you be able to prove that my product is actually based on your idea and steals it? Maybe.

It's not a very strong protection either, but I would recommend you do it together with an NDA and NCC.

Patent

A patent is the best you can get to protect your idea. A patent is basically a guarantee of safety that you buy from the government. To get that guarantee, you have to do three things: 1) explain what exactly will be protected, 2) prove that it doesn't belong to someone else, and 3) pay your dues regularly. It's very similar to what gangsters do when they "protect" you, but here we're dealing with intellectual property and there is only one "gangster" per country :) The concept is pretty much the same.

First, you describe your idea in the format that patents are written. It is not difficult at all, but it would help if you read one of those "how to file a patent" books. I recommend Patent It Yourself: Your Step-by-Step Guide to Filing at the U.S. Patent Office by David Pressman and Thomas J. Tuytschaevers.

Your application will likely amount to about 10 pages and should take a few days of your time if you know your idea well. No need to hire any attorneys; that's a waste of money.

Then you should do some research to make sure nothing exactly the same already exists (something similar is fine). For example, you can find Google patents for searching algorithms and mention them in your patent in the list of references.

Finally, you pay $425 (if you're a small company) and submit it to the USPTO. There is also an option to file a "provisional" patent, but I would recommend you not do this, as it's just an extra hassle. Simply file a normal one.

Once you've paid, your protection starts immediately. If I create a product that uses the idea described in your patent, you can bring me to court and ask me to share my profit with you. You will claim that I was making money by using your brilliant idea, and now it's time to share that success. Legally speaking, you will accuse me of patent infringement.

First, I will try to invalidate your patent, claiming that something similar already existed before you filed your patent---so-called prior art. If I succeed, the USPTO will invalidate your patent without a refund, and I'll walk away, paying you nothing.

If I fail, I'll try to prove that I didn't infringe on your patent. I will say that my product is not using your ideas but rather is designed with something else in mind, just like Samsung did. Maybe I'll win, but my chances will be low.

By the way, in a few years, you will receive a patent from the USPTO and put it on the shelf. You will then have to pay $480 more. Also, at the end of three years, you will have to pay $800 just to keep your guarantee alive. That escalates to $1,800 in seven years and $3,700 in 11 years (see the fee schedule). Told you; just like gangsters :)

To summarize, getting a patent is the best instrument available at the moment in developed countries to protect your business idea. However, as the Apple vs. Samsung lawsuit demonstrates, it is not a 100 percent guarantee either.

How to Implement an Iterating Adapter

Iterator is one of the fundamental Java interfaces, introduced in Java 1.2. It is supposed to be very simple; however, in my experience, many Java developers don't understand how to implement a custom one, which should iterate a stream of data coming from some other source. In other words, it becomes an adapter of another source of data, in the form of an iterator. I hope this example will help.

Let's say we have an object of this type:

interface Data {
  byte[] read();
}

When we call read(), it returns a new array of bytes that were retrieved from somewhere. If there is nothing to retrieve, the array will be empty. Now, we want to create an adapter that would consume the bytes and let us iterate them:

final class FluentData implements Iterator<Byte> {
  public boolean hasNext() { /* ... */ }
  public Byte next() { /* ... */ }
  public void remove() { /* ... */ }
}

Here is how it should look (it is not thread-safe!):

final class FluentData implements Iterator<Byte> {
  private final Data data;
  private final Queue<Byte> buffer = new LinkedList<>();
  public FluentData(final Data dat) {
    this.data = dat;
  }
  public boolean hasNext() {
    if (this.buffer.isEmpty()) {
      for (final byte item : this.data.read()) {
        this.buffer.add(item);
      }
    }
    return !this.buffer.isEmpty();
  }
  public Byte next() {
    if (!this.hasNext()) {
      throw new NoSuchElementException("Nothing left");
    }
    return this.buffer.poll();
  }
  public void remove() {
    throw new UnsupportedOperationException("It is read-only");
  }
}

There is no way to make it thread-safe because the iterating process is outside the scope of the iterator. Even if we declare our methods as synchronized, this won't guarantee that two threads won't conflict when they both call hasNext() and next(). So don't bother with it and just document the iterator as not thread-safe, then let its users synchronize one level higher when necessary.
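
Synchronizing "one level higher" simply means that callers, who see the whole check-then-read step, guard the hasNext()/next() pair with one shared lock so the pair executes atomically. A minimal sketch, with hypothetical names:

```java
import java.util.Iterator;
import java.util.Optional;

// Hypothetical helper: all threads pass the same lock object, so one
// thread cannot sneak in between another's hasNext() and next() calls.
final class SafeNext {
  static Optional<Byte> from(final Iterator<Byte> iterator, final Object lock) {
    synchronized (lock) {
      if (iterator.hasNext()) {
        return Optional.of(iterator.next());
      }
      return Optional.empty();
    }
  }
}
```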

My Favorite Software Books

There are plenty of books about software engineering, but only a few of them rank among my favorites. I read all of those that do over and over again, and I might just update this post in the future when I stumble upon something else that's decent.

Note that I tried to put the most important books at the top of the list.

Object Thinking by David West. This is the best book I've read about object-oriented programming, and it totally changed my understanding of it. I would recommend you read it a few times. But before reading, try to forget everything you've heard about programming in the past. Try to start from scratch. Maybe it will work for you too :)

PMP Exam Prep, Eighth Edition: Rita's Course in a Book for Passing the PMP Exam by Rita Mulcahy. This book is my favorite for project management. Even though it's about the PMI approach and PMBOK in particular, it is a must-read for everyone who is interested in management. Ignore the PMBOK specifics and focus on the philosophy of project management and the role of project manager in it.

The Art of Software Testing by Glenford J. Myers et al. You can read my short review of this book here. The book perfectly explains the philosophy of testing and destroys many typical myths and stereotypes. No matter what your job description is, if you're working in the software industry, you should understand testing and its fundamental principles. This is the only book you need in order to get that understanding.

Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. All you need to know about unit testing is in this book. I'm fully aware that I didn't include Kent Beck's famous book in this list, because I don't like it at all. You definitely should read it, just to know what's going on, but it won't help you write good tests. Read this one instead, and read it many times.

Working Effectively With Legacy Code by Michael Feathers. This is awesome reading about modern software development, its pitfalls, and typical failures. Most of the code we're working on now is legacy (a.k.a. open source). I read this book as a novel.

Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation by Jez Humble and David Farley. This is a perfect book about software delivery, continuous integration, testing, packaging, versioning, and many other techniques involved in programming. It's definitely a must-read for anyone who is serious about software engineering.

XML in a Nutshell, Third Edition by Elliotte Rusty Harold and W. Scott Means. XML is my favorite standard. And I hated it before I read this book. I didn't understand all the strange prefixes, namespaces, XPath expressions, and schemes. Just this one book changed everything, and ever since reading it, I've used XML everywhere. It is very well written and easy to read. It's a must for everybody.

Java Concurrency in Practice by Brian Goetz et al. This is a very practical book about Java multi-threading, and at the same time, it provides a lot of theoretical knowledge about concurrency in general. I highly recommend you read it at least once.

Effective Modern C++: 42 Specific Ways to Improve Your Use of C++11 and C++14 by Scott Meyers. No matter what language you're using, this book is very interesting and very useful. It makes many important suggestions about better C++ coding. If you understand most of them, your Java/Ruby/Python/Scala coding skills will improve significantly.

Code Complete: A Practical Handbook of Software Construction, Second Edition by Steve McConnell. Consider this the bible of clean coding. Read it a few times and use it as a reference manual in debates with your colleagues. It mentions the most terrible anti-patterns and worst practices you'll see in modern programming. To be a good programmer, you must know all of them.

Software Estimation: Demystifying the Black Art by Steve McConnell. This one's an interesting read about software engineering and its most tricky part---estimations. At the least, read it to be aware of the problem and possible solutions.

Writing Effective Use Cases by Alistair Cockburn. An old and very good book. You won't actually use anything from it in your real projects, but you will pick up the philosophy of use cases, which will redirect your mind in the right direction. Don't take this book as something practical; these use cases are hardly used anywhere today, but the idea of scoping functionality this way is absolutely right.

Software Requirements, Third Edition by Karl Wiegers (author) and Joy Beatty. A superb book about requirements analysis, the first and most important activity in any software project. Even if you're not an analyst, this book is a must-read.

Version Control With Git: Powerful Tools and Techniques for Collaborative Software Development by Jon Loeliger and Matthew McCullough. This title serves as a practical guide for Git, a version control system. Read it from cover to cover and you will save many hours of your time in the future. Git is a de-facto standard in version control, and every programmer must know its fundamental principles---not from a cheat sheet but from an original source.

JavaScript: The Definitive Guide: Activate Your Web Pages by David Flanagan. JavaScript is a language of the modern Web, and this book explains it very well. No matter what kind of software you develop, you must know JavaScript. Don't read it as a practical guide (even though it's called a guide) but rather as food for thought. JavaScript offers a lot to learn for Java/Ruby/Python developers.

CSS: The Definitive Guide by Eric A. Meyer. CSS is not just about colors and shadows, and it's not only for graphic designers. CSS is a key language of the modern Web. Every developer must know it, whether you're working with a back-end, front-end, or desktop application in C++.

Also, check my GoodReads profile.

Software Quality Award, 2015


I'm a big fan of rules and discipline in software development; as an example, see Are You a Hacker or a Designer?. Also, I'm a big fan of object-oriented programming in its purest form; for example, see Seven Virtues of a Good Object. I'm also a co-founder and the CEO of Zerocracy, a software development company through which I put my admiration of discipline and clean design into practice.

I want to encourage you to share my passion---not just by reading this blog but through making real open source software in a disciplined way. This award is for those who are brave enough to swim against the current and value quality above everything else.

Send me your own project for review and participate in the contest.

Rules:

  • One person can submit up to three projects.

  • Submissions are accepted until September 1, 2015 (closed).

  • Submissions must be sent via email to me@yegor256.com. All I need is your GitHub login and repository name; I will check the commit history to make sure you're the main contributor to the project.

  • I reserve the right to reject any submission without explanation.

  • All submissions will be published on this page (including rejected ones).

  • Results will be announced October 15 on this page and by email.

  • The best project will receive $4,096.

  • The best 8 projects will receive 1-year open source licenses to any JetBrains products (one license per project).

  • Final decisions will be made by me and are not negotiable (although I may invite other people to help me make the right decision).

Each project must be:

  • Open source (in GitHub).

  • At least 5,000 lines of code.

  • At least one year old.

  • Object-oriented (that's the only thing I understand).

The best projects will feature (more about it):

  • Strict and visible principles of design.

  • Continuous delivery.

  • Traceability of changes.

  • Self-documented source code.

  • Strict rules of code formatting.

What doesn't matter:

  • Popularity. Even if nobody is using your product, it is still eligible for this award. I don't care about popularity; quality is the key.

  • Programming language. I believe that any language, used correctly, can be applied to design a high-quality product.

  • Buzz and trends. Even if your project is yet another parser of command line arguments, it's still eligible for the award. I don't care about your marketing position; quality is all.

By the way, if you want to sponsor this award and increase the bonus, email me.


158 projects submitted so far (in order of submission):


October 4th: A few weeks ago, I asked three guys who work with me to check every single project in this list and provide their feedback. I received three plain text files from them. Here they are, combined into one, with almost no corrections: award-2015.txt (you can find your project there). Based on their opinions, I've decided to select the following 12 projects for closer review (in alphabetical order):

I'll review them soon. The winner will be announced on the 15th of October.

October 5th: I received an email from the author of raphw/byte-buddy, asking me to reconsider my decision about this project. I took a quick look at why the project was filtered out and decided to include it in the list of finalists. BTW, if any of you think that your project was excluded by mistake, don't hesitate to email me.

October 11th: I analyzed all 12 projects today. All of them are really good projects; that's why, in order to find the best one, I focused on their sins, not their virtues. Here are my preliminary findings.

coala-analyzer/coala (14K Python LoC, 160K HoC)

  • None is used in many places (I found over 400), which technically is NULL, a serious anti-pattern.
  • There are global functions, for example get_language_tool_results and DictUtilities. They are definitely a bad idea in OOP.
  • Class Constants is a terrible idea.
  • Checking object types in runtime is a bad idea, e.g. ClangCountVectorCreator.py
  • What's wrong with cindex.py? There are almost 3200 lines of code, that's way too many.
  • Static analysis is not a mandatory step in the build/release pipeline. That's why, I believe, code formatting is not consistent and sometimes rather ugly. For example, pylint reports hundreds of issues. (update: scrutinizer is used, but I still believe that a local use of pylint would seriously improve the quality of code)
  • Some methods have documentation, others don't. I didn't understand the logic. Would be great to have all methods documented. Also, not all classes are documented.
  • Score: 5

checkstyle/checkstyle (83K Java LoC, 553K HoC)

  • There are many ER-ending classes, like SeverityLevelCounter, Filter, and AbstractLoader (for example), which are anti-patterns.
  • There is a whole bunch of utility classes, which are definitely a bad thing in OOP. They are even grouped into a special utils package, such a terrible idea.
  • Setters and getters are everywhere, together with mutable classes, which really are not an OOP thing, for example DetectorOptions.
  • NULL is actively used, in many places---it's a serious anti-pattern
  • I've found five .java files with over 1000 lines in each of them, for example 2500+ in ParseTreeBuilder.java
  • There are direct commits to master made by different contributors and some of them are not linked back to any tickets. It's impossible to understand why they were made. Look at this for example: 7c50922. Was there a discussion involved? Who made a decision? Not clear at all.
  • Releases are not documented at all.
  • Release procedure is not automated. At least I didn't find any release script in the repository.
  • Score: 3

citiususc/hipster (5K Java LoC, 64K HoC)

  • Getters and setters are used in many places, for example in DepthFirstSearch and Transition. Aside from that, almost all classes are mutable and are used as transfer bags for the needs of the algorithm. This is not how OOP should be used, I believe.
  • There are public static methods and even utility classes, for example this one, with a funny name F
  • NULL is used actively, especially in iterators---it's a bad idea
  • JavaDoc documentation is not consistent, some methods are documented, others aren't.
  • Not all commits are linked to tickets, look at this, for example: 8cfa5de.
  • Changes are committed directly to the master branch; pull requests are not used at all.
  • I didn't find an automated procedure for release. I found one for regular snapshot deployment to Bintray, but what about releases? Are they done manually?
  • There is no static analysis, that's why the code looks messy sometimes.
  • The amount of unit tests is rather small. Besides that, I didn't find a real code coverage report published anywhere.
  • Score: 4

gulpjs/gulp (700 JS LoC)

  • This project is too small for the competition, just 700 lines of code. Disqualified.
  • Score: 0

kaitoy/pcap4j (42K LoC, 122K HoC)

  • There is a util package with utility classes, which are a bad practice.
  • NULL is used in mutable objects, for example in AbstractPcapAddress; it's a bad idea.
  • There are too many static methods and variables. They are literally everywhere. There is even a module called pcap4j-packetfactory-static, full of "classes" with static methods.
  • JavaDoc documentation is not consistent and sometimes just incomplete; check this, for example.
  • There are just a few issues and only six pull requests. Commits are not linked to issues. There is almost zero traceability of changes.
  • Release procedure is not automated, and releases are not documented.
  • There is no static analysis; that's why the code looks messy sometimes.
  • Score: 3

raphw/byte-buddy (84K LoC, 503K HoC)

  • I found over 20 .java files with over 1000 lines of code. TypePool.java even has 6200 lines!
  • There are many public static methods and properties. I realize that maybe that's the only way to deal with the problem domain in Java, but still...
  • instanceof is used very often, and it's a bad practice in OOP. Again, I understand that the problem domain may require it sometimes, but still...
  • Most commits are made directly to master, without pull requests or tickets, that's why traceability of them is broken.
  • Release procedure is not automated (I didn't find a script).
  • Score: 5

subchen/snack-string (1K LoC, 2K HoC)

  • The project is too small, disqualified.
  • Score: 0

gvlasov/inflectible (5K LoC, 36K HoC)

  • The project is rather small, right on the edge of competition requirements and is made by a single developer. Besides that I don't see any problems here. The code looks object oriented, all changes are traceable back to issues and pull requests, release procedure is automated, static analysis is mandatory, releases are documented. Thumbs up!
  • Score: 10

testinfected/molecule (10K LoC, 43K HoC)

  • There are a few utility classes, for example Streams.
  • There are setters and getters in some classes, even though they follow a different naming convention, for example Request and Response.
  • Most .java files don't have any JavaDoc blocks, which at least looks consistent, but then, all of a sudden, some files do have documentation, for example WebServer.
  • There are not so many issues, and most commits are not traceable back to any of them, for example b4143a0---why was it made? Not clear. Also, there are almost no pull requests. It looks like the author is just committing to master.
  • Release procedure is not documented/automated. I didn't find it. Also, releases are not documented at all.
  • Static analysis is absent.
  • Score: 6

trautonen/coveralls-maven-plugin (4.5K LoC)

  • The project is too small for the competition, less than 5K lines of code. Besides that, it's younger than one year, the first commit was made in May 2015. Disqualified.
  • Score: 0

wbotelhos/raty (8.7K LoC, 63K HoC)

  • There are utility classes, for example helper.js
  • There are global functions, for example in helper.js
  • jasmine.js has 2400 lines of code, which is way too many
  • I didn't understand why .html files stay together with .js in the same directory, for example run.html
  • Not all changes are traceable to issues, for example 0a233e8. There are not so many issues in the project and just a few pull requests.
  • Release procedure is not automated (at least I didn't find any documentation about it)
  • There is no static analysis
  • There are no unit tests
  • Score: 2

xvik/guice-persist-orient (17K LoC, 54K HoC)

  • There is an ORM, and that's why there are getters and setters, for example in VersionedEntity
  • Dependency injection is actively used, which, I believe, is a bad idea in OOP in general. But in this project, I understand, it is required by the problem domain. Anyway...
  • There are just a few issues and almost no pull requests, and commits are not traceable back to issues, for example this one: e9c8f79
  • ~~There is no static analysis~~ (static analysis is there, with a subset of Checkstyle, PMD and FindBugs checks)
  • Score: 5

I paid most attention to anti-patterns, which is the first and the most terrible sin we should try to avoid. Presence of null, for example, much more seriously affected the score than the absence of an automated release procedure.

Oct 15: Thus, we have these best projects, out of 158 submitted to the competition:

  1. gvlasov/inflectible: winner!
  2. testinfected/molecule
  3. coala-analyzer/coala
  4. xvik/guice-persist-orient
  5. raphw/byte-buddy
  6. citiususc/hipster
  7. checkstyle/checkstyle
  8. kaitoy/pcap4j

Congratulations to @gvlasov, the winner! Here is your badge:

winner

Put this code into GitHub README:

<a href="http://www.yegor256.com/2015/04/16/award.html">
  <img src="//img.teamed.io/award/2015/winner.png"
  style="width:203px;height:45px;" alt='winner'/></a>

All eight projects will receive a free one-year, single-user license for one JetBrains product. I will email you all, and we'll figure out how to transfer them.

Thanks to everybody for participation! See you next year.

© Yegor Bugayenko 2014–2018

Tacit, a CSS Framework Without Classes

I've been using Bootstrap for more than two years in multiple projects, and my frustration has been building. First of all, it's too massive for a small web app. Second, it is not fully self-sufficient; no matter how much you follow its principles of design, you end up with your own CSS styles anyway. Third, and most importantly, its internal design is messy. Having all this in mind, I created tacit, my own CSS framework, which immediately received positive feedback on Hacker News.

Tacit, according to Google, means "understood or implied without being stated." That's exactly the idea of the framework. It doesn't have a single CSS class and can be applied to any valid HTML5 document. For example, you have an HTML document:

<!DOCTYPE html>
<html>
  <head>
    <title>Subscribe</title>
  </head>
  <body>
    <section>
      <p>Are you interested in learning more?</p>
      <form>
        <label>Email:</label>
        <input name="email"/>
        <button type="submit">Subscribe</button>
      </form>
    </section>
  </body>
</html>

This is how it looks in Safari:

The figure

Now, I add tacit.min.css to it:

<!DOCTYPE html>
<html>
  <head>
    <title>Subscribe</title>
    <link rel="stylesheet" type="text/css"
      href="http://yegor256.github.io/tacit/tacit.min.css"/>
  </head>
  <body>
    <section>
      <p>Are you interested in learning more?</p>
      <form>
        <label>Email:</label>
        <input name="email" type="text"/>
        <button type="submit">Subscribe</button>
      </form>
    </section>
  </body>
</html>

This is how it looks in the same Safari browser:

The figure

I hope you got the idea. The HTML itself wasn't changed at all. All CSS styles are applied to standard HTML elements. Unlike many other CSS frameworks, in Tacit you don't have to mention CSS classes in the HTML document. The HTML stays clean and only exposes the data in a pure HTML5 way.

The HTML document is still readable and usable, but it doesn't have the good-looking-graphics component. Tacit adds that component in a non-intrusive manner.

Of course, in many projects, the default layout features of Tacit won't be enough. In most cases, I still have to add my own CSS classes and inline styles. But Tacit gives me an adequate foundation to start from. It solves most of the problems associated with responsiveness of forms, appearance of form controls, tables, fonts, and colors.

Tacit allows me to focus on functionality from the first day of a project. And the functionality immediately looks attractive. I have tried many other frameworks, including Bootstrap, Kube, and Pure. None of them are designed with this concept in mind. They all put CSS in front of HTML. In all of them, CSS is the most important element of web design, while HTML is something that assists.

Tacit takes a different approach. In Tacit, HTML is king while CSS is a supportive element that only makes data look better.

Enjoy :)

Class Casting Is a Discriminating Anti-Pattern

Type casting is a very useful technique when there is no time or desire to think and design objects properly. Type casting (or class casting) helps us work with provided objects differently, based on the class they belong to or the interface they implement. Class casting helps us discriminate against the poor objects and segregate them by their race, gender, and religion. Can this be a good practice?

Гадкий утенок (1956) by Владимир Дегтярёв

This is a very typical example of type casting (Google Guava is full of it, for example Iterables.size()):

public final class Foo {
  public int sizeOf(Iterable items) {
    int size = 0;
    if (items instanceof Collection) {
      size = Collection.class.cast(items).size();
    } else {
      for (Object item : items) {
        ++size;
      }
    }
    return size;
  }
}

This sizeOf() method calculates the size of an iterable. However, it is smart enough to understand that if items are also instances of Collection, there is no need to actually iterate them. It would be much faster to cast them to Collection and then call method size(). Looks logical, but what's wrong with this approach? I see two practical problems.

First, there is a hidden coupling of sizeOf() and Collection. This coupling is not visible to the clients of sizeOf(). They don't know that method sizeOf() relies on interface Collection. If tomorrow we decide to change it, sizeOf() won't work. And we'll be very surprised, since its signature says nothing about this dependency. This won't happen with Collection, obviously, since it is part of the Java SDK, but with custom classes, this may and will happen.

The second problem is an inevitably growing complexity of method sizeOf(). The more special types it has to treat differently, the more complex it will become. This if/then forking is inevitable, since it has to check all possible types and give them special treatment. Such complexity is a result of a violation of the single responsibility principle. The method is not only calculating the size of Iterable but is also performing type casting and forking based on that casting.

What is the alternative? There are a few, but the most obvious is method overloading (not available in semi-OOP languages like Ruby or PHP):

public final class Foo {
  public int sizeOf(Iterable items) {
    int size = 0;
    for (Object item : items) {
      ++size;
    }
    return size;
  }
  public int sizeOf(Collection items) {
    return items.size();
  }
}

Isn't that more elegant?
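
One caveat is worth spelling out: unlike instanceof, overload resolution happens at compile time, based on the declared type of the argument, not the runtime type. Here is a minimal sketch (the generic parameters and the main() driver are my additions) showing that a caller holding a Collection reference gets the fast overload, while a caller holding the very same object through an Iterable reference gets the counting one:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public final class Foo {
  public int sizeOf(final Iterable<?> items) {
    int size = 0;
    for (final Object item : items) {
      ++size;
    }
    return size;
  }
  public int sizeOf(final Collection<?> items) {
    return items.size();
  }
  public static void main(final String... args) {
    final Foo foo = new Foo();
    final List<String> list = Arrays.asList("a", "b", "c");
    // Declared type is List (a Collection): the fast overload is chosen.
    System.out.println(foo.sizeOf(list));
    // Declared type is Iterable: the counting overload runs, even though
    // the runtime object is still the same collection.
    final Iterable<String> iterable = list;
    System.out.println(foo.sizeOf(iterable));
  }
}
```

In other words, the overloaded design trades the hidden runtime check for an explicit compile-time one, which is exactly what makes the dependency visible in the method signatures.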

Philosophically speaking, type casting is discrimination against the object that comes into the method. The object complies with the contract provided by the method signature. It implements the Iterable interface, which is a contract, and it expects equal treatment with all other objects that come into the same method. But the method discriminates objects by their types. The method is basically asking the object about its... race. Black objects go right while white objects go left. That's what this instanceof is doing, and that's what discrimination is all about.

By using instanceof, the method is segregating incoming objects by the certain group they belong to. In this case, there are two groups: collections and everybody else. If you are a collection, you get special treatment. Even though you abide by the Iterable contract, we still treat some objects specially because they belong to an "elite" group called Collection.

You may say that Collection is just another contract that an object may comply with. That's true, but in this case, there should be another door through which those who work by that contract should enter. You announced that sizeOf() accepts everybody who works on the Iterable contract. I am an object, and I do what the contract says. I enter the method and expect equal treatment with everybody else who comes into the same method. But, apparently, once inside the method, I realize that some objects have some special privileges. Isn't that discrimination?

To conclude, I would consider instanceof and class casting to be anti-patterns and code smells. Once you see a need to use them, start thinking about refactoring.

How AppVeyor Helps Me to Validate Pull Requests Before Rultor Merges Them

AppVeyor is a great cloud continuous integration service that builds Windows projects. Rultor is a DevOps assistant, which automates release, merge and deploy operations, using Docker containers. These posts explain how Rultor works and what it's for: Rultor.com, a Merging Bot and Master Branch Must Be Read-Only.

The problem is that Rultor runs all scripts inside Docker containers, and Docker can't build Windows projects. The only logical solution is to trigger AppVeyor before running all other scripts in Docker. If AppVeyor gives a green light, we continue with our usual in-Docker script. Otherwise, we fail the entire build. Below I explain how this automation was configured in the Takes framework.

First, I got a token from my AppVeyor account (at the time of writing it was here). I created a text file curl-appveyor.cfg with this content (it's not my real token inside, just an example):

--silent
--header "Authorization: Bearer 1hdmsfbs7xccb9x6g1y4"
--header "Content-Type: application/json"
--header "Accept: application/json"

Then, I encrypted this file, using rultor command line tool:

$ rultor encrypt -p yegor256/takes curl-appveyor.cfg

This produced a file called curl-appveyor.cfg.asc, which I committed and pushed to the yegor256/takes GitHub repository:

$ git add curl-appveyor.cfg.asc
$ git commit -am 'CURL config for Appveyor'
$ git push origin master

Then, I configured AppVeyor "pinging" from Docker script. This is what I did in .rultor.yml:

decrypt:
  curl-appveyor.cfg: "repo/curl-appveyor.cfg.asc"
merge:
  script: |-
    ver=$(curl -K ../curl-appveyor.cfg \
      --data "{accountName: 'yegor256',
        projectSlug: 'takes',
        pullRequestId: '${pull_id}'}" \
      https://ci.appveyor.com/api/builds | jq -r '.version')
    while true; do
      status=$(curl -K ../curl-appveyor.cfg \
        https://ci.appveyor.com/api/projects/yegor256/takes/build/${ver} \
        | jq -r '.build.status')
      if [ "${status}" == "success" ]; then break; fi
      if [ "${status}" == "failed" ]; then
        echo "see https://ci.appveyor.com/project/yegor256/takes/build/${ver}"
        exit 1
      fi
      echo "waiting for AppVeyor build ${ver}: ${status}"
      sleep 5s
    done
    mvn clean install

There is no magic here; it's very simple. First, I start a new build using the /api/builds endpoint of the AppVeyor REST API. ${pull_id} is an environment variable that comes from Rultor; it contains the number of the current pull request.

I'm using jq in order to parse AppVeyor JSON output.

Once the build is started, I get its unique version and start looping to check its status. I'm waiting for success or failed. Anything else means that the build is still in progress and I should keep looping.

You can see how it works in this pull request, for example: yegor256/takes#93.

JAXB Is Doing It Wrong; Try Xembly

JAXB is a 10-year-old Java technology that allows us to convert a Java object into an XML document (marshalling) and back (unmarshalling). This technology is based on setters and getters and, in my opinion, violates key principles of object-oriented programming by turning objects into passive data structures. I would recommend you use Xembly instead for marshalling Java objects into XML documents.

This is how JAXB marshalling works. Say you have a Book class that needs to be marshalled into an XML document. You have to create getters and annotate them:

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement
public class Book {
  private final String isbn;
  private final String title;
  // JAXB requires a no-argument constructor to build its model;
  // it may stay private:
  private Book() {
    this(null, null);
  }
  public Book(final String isbn, final String title) {
    this.isbn = isbn;
    this.title = title;
  }
  @XmlElement
  public String getIsbn() {
    return this.isbn;
  }
  @XmlElement
  public String getTitle() {
    return this.title;
  }
}

Then you create a marshaller and ask it to convert an instance of class Book into XML:

final Book book = new Book("0132350882", "Clean Code");
final JAXBContext context = JAXBContext.newInstance(Book.class);
final Marshaller marshaller = context.createMarshaller();
marshaller.marshal(book, System.out);

You should be expecting something like this in the output:

<?xml version="1.0"?>
<book>
  <isbn>0132350882</isbn>
  <title>Clean Code</title>
</book>

So what's wrong with it? Pretty much the same thing that's wrong with object-relational mapping, which is explained in ORM Is an Offensive Anti-Pattern. JAXB is treating an object as a bag of data, extracting the data and converting it into XML the way JAXB wants. The object has no control over this process. Therefore an object is not an object anymore but rather a passive bag of data.

An ideal approach would be to redesign our class Book this way:

public class Book {
  private final String isbn;
  private final String title;
  public Book(final String isbn, final String title) {
    this.isbn = isbn;
    this.title = title;
  }
  public String toXML() {
    // create XML document and return
  }
}

However, there are a few problems with this approach. First of all, there's massive code duplication. Building an XML document is a rather verbose process in Java. If every class had to re-implement it in its toXML() method, we would have a big problem with duplicate code.

The second problem is that we don't know exactly what type of wrapping our XML document should be delivered in. It may be a String or an InputStream or maybe an instance of org.w3c.dom.Document. Making many toXML() methods in each object would definitely be a disaster.

Xembly provides a solution. As I've mentioned before, it is an imperative language for XML constructions and manipulations. Here is how we can implement our Book object with the help of Xembly:

import org.xembly.Directive;
public class Book {
  private final String isbn;
  private final String title;
  public Book(final String isbn, final String title) {
    this.isbn = isbn;
    this.title = title;
  }
  public Iterable<Directive> toXembly() {
    return new Directives()
      .add("book")
      .add("isbn").set(this.isbn).up()
      .add("title").set(this.title).up()
      .up();
  }
}

Now, in order to build an XML document, we should use this code outside the object:

final Book book = new Book("0132350882", "Clean Code");
final String xml = new Xembler(book.toXembly()).xml();

This Xembler class will convert Xembly directives into an XML document.

The beauty of this solution is that the internals of the object are not exposed via getters and the object is fully in charge of the XML marshalling process. In addition, the complexity of these directives may be very high---much higher than the rather cumbersome annotations of JAXB.

Xembly is an open source project, so feel free to submit your questions or corrections to GitHub.

Java Web App Architecture In Takes Framework

I used to utilize Servlets, JSP, JAX-RS, Spring Framework, Play Framework, JSF with Facelets, and a bit of Spark Framework. All of these solutions, in my humble opinion, are very far from being object-oriented and elegant. They all are full of static methods, untestable data structures, and dirty hacks. So about a month ago, I decided to create my own Java web framework. I put a few basic principles into its foundation: 1) no NULLs, 2) no public static methods, 3) no mutable classes, and 4) no class casting, reflection, or instanceof operators. These four basic principles should guarantee clean code and transparent architecture. That's how the Takes framework was born. Let's see what was created and how it works.

Making of The Godfather (1972) by Francis Ford Coppola

Java Web Architecture in a Nutshell

This is how I understand a web application architecture and its components, in simple terms.

First, to create a web server, we should create a new network socket that accepts connections on a certain TCP port. Usually it is 80, but I'm going to use 8080 for testing purposes. This is done in Java with the ServerSocket class:

import java.net.ServerSocket;
public class Foo {
  public static void main(final String... args) throws Exception {
    final ServerSocket server = new ServerSocket(8080);
    while (true);
  }
}

That's enough to start a web server. Now, the socket is ready and listening on port 8080. When someone opens http://localhost:8080 in their browser, the connection will be established and the browser will spin its waiting wheel forever. Compile this snippet and try. We just built a simple web server without the use of any frameworks. We're not doing anything with incoming connections yet, but we're not rejecting them either. All of them are being lined up inside that server object. It's being done in a background thread; that's why we need to put that while(true) in afterward. Without this endless pause, the app will finish its execution immediately and the server socket will shut down.

The next step is to accept the incoming connections. In Java, that's done through a blocking call to the accept() method:

final Socket socket = server.accept();

The method is blocking its thread and waiting until a new connection arrives. As soon as that happens, it returns an instance of Socket. In order to accept the next connection, we should call accept() again. So basically, our web server should work like this:

public class Foo {
  public static void main(final String... args) throws Exception {
    final ServerSocket server = new ServerSocket(8080);
    while (true) {
      final Socket socket = server.accept();
      // 1. Read HTTP request from the socket
      // 2. Prepare an HTTP response
      // 3. Send HTTP response to the socket
      // 4. Close the socket
    }
  }
}

It's an endless cycle that accepts a new connection, understands it, creates a response, returns the response, and accepts a new connection again. HTTP protocol is stateless, which means the server should not remember what happened in any previous connection. All it cares about is the incoming HTTP request in this particular connection.

The HTTP request is coming from the input stream of the socket and looks like a multi-line block of text. This is what you would see if you read an input stream of the socket:

final BufferedReader reader = new BufferedReader(
  new InputStreamReader(socket.getInputStream())
);
while (true) {
  final String line = reader.readLine();
  // readLine() returns null when the stream ends
  if (line == null || line.isEmpty()) {
    break;
  }
  System.out.println(line);
}

You will see something like this:

GET / HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,ru;q=0.6,uk;q=0.4

The client (the Google Chrome browser, for example) passes this text into the connection established. It connects to port 8080 at localhost, and as soon as the connection is ready, it immediately sends this text into it, then waits for a response.

Our job is to create an HTTP response using the information we get in the request. If our server is very primitive, we can basically ignore all the information in the request and just return "Hello, world!" to all requests (I'm using IOUtils for simplicity):

import java.net.Socket;
import java.net.ServerSocket;
import org.apache.commons.io.IOUtils;
public class Foo {
  public static void main(final String... args) throws Exception {
    final ServerSocket server = new ServerSocket(8080);
    while (true) {
      try (final Socket socket = server.accept()) {
        IOUtils.copy(
          IOUtils.toInputStream("HTTP/1.1 200 OK\r\n\r\nHello, world!"),
          socket.getOutputStream()
        );
      }
    }
  }
}

That's it. The server is ready. Try to compile and run it. Point your browser to http://localhost:8080, and you will see Hello, world!:

$ javac -cp commons-io.jar Foo.java
$ java -cp commons-io.jar:. Foo &
$ curl http://localhost:8080 -v
* Rebuilt URL to: http://localhost:8080/
* Connected to localhost (::1) port 8080 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
* no chunk, no close, no size. Assume close to signal end
<
* Closing connection 0
Hello, world!
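
The same round trip can be reproduced without curl. This sketch (the class name and the ephemeral-port trick are my additions) starts the hello-world server from above in a background thread and talks to it through a raw client socket:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawHttpDemo {
  // Starts the hello-world server on an ephemeral port, sends one raw
  // HTTP request to it, and returns the last line of the response.
  public static String fetch() throws Exception {
    // Port 0 asks the OS for any free port.
    final ServerSocket server = new ServerSocket(0);
    final Thread thread = new Thread(() -> {
      try (Socket socket = server.accept()) {
        socket.getOutputStream().write(
          "HTTP/1.1 200 OK\r\n\r\nHello, world!"
            .getBytes(StandardCharsets.UTF_8)
        );
      } catch (final Exception ex) {
        throw new IllegalStateException(ex);
      }
    });
    thread.start();
    String last = "";
    try (Socket client = new Socket("localhost", server.getLocalPort())) {
      final OutputStream out = client.getOutputStream();
      out.write(
        "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
          .getBytes(StandardCharsets.UTF_8)
      );
      out.flush();
      final BufferedReader reader = new BufferedReader(
        new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8)
      );
      String line;
      // Read until the server closes the socket (end of stream).
      while ((line = reader.readLine()) != null) {
        last = line;
      }
    }
    thread.join();
    server.close();
    return last;
  }
  public static void main(final String... args) throws Exception {
    System.out.println(fetch());
  }
}
```

The last line read before the stream ends is the response body, Hello, world!, exactly what curl showed above.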

That's all you need to build a web server. Now let's discuss how to make it object-oriented and composable. Let's try to see how the Takes framework was built.

Routing/Dispatching

Routing/dispatching is combined with response printing in Takes. All you need to do to create a working web application is to create a single class that implements Take interface:

import org.takes.Request;
import org.takes.Response;
import org.takes.Take;
import org.takes.rs.RsText;
public final class TkFoo implements Take {
  @Override
  public Response route(final Request request) {
    return new RsText("Hello, world!");
  }
}

And now it's time to start a server:

import org.takes.http.Exit;
import org.takes.http.FtBasic;
public class Foo {
  public static void main(final String... args) throws Exception {
    new FtBasic(new TkFoo(), 8080).start(Exit.NEVER);
  }
}

This FtBasic class does the exact same socket manipulations explained above. It starts a server socket on port 8080 and dispatches all incoming connections through an instance of TkFoo that we are giving to its constructor. It does this dispatching in an endless cycle, checking every second whether it's time to stop with an instance of Exit. Obviously, Exit.NEVER always responds with, "Don't stop, please."

HTTP Request

Now let's see what's inside the HTTP request arriving at TkFoo and what we can get out of it. This is how the Request interface is defined in Takes:

public interface Request {
  Iterable<String> head() throws IOException;
  InputStream body() throws IOException;
}

The request is divided into two parts: the head and the body. The head contains all lines that go before the empty line that starts a body, according to HTTP specification in RFC 2616. There are many useful decorators for Request in the framework. For example, RqMethod will help you get the method name from the first line of the header:

final String method = new RqMethod(request).method();

RqHref will help extract the query part and parse it. For example, this is the request:

GET /user?id=123 HTTP/1.1
Host: www.example.com

This code will extract that 123:

final int id = Integer.parseInt(
  new RqHref(request).href().param("id").get(0)
);

RqPrint can get the entire request or its body printed as a String:

final String body = new RqPrint(request).printBody();

The idea here is to keep the Request interface simple and provide this request parsing functionality to its decorators. This approach helps the framework keep classes small and cohesive. Each decorator is very small and solid, doing exactly one thing. All of these decorators are in the org.takes.rq package. As you already probably understand, the Rq prefix stands for Request.
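
To illustrate how such a decorator works, here is a sketch that reproduces the two-method Request interface locally and wraps it in a hypothetical RqVerb decorator (the real RqMethod is more elaborate, but the shape is the same):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class RqDemo {
  // The two-method interface from the article, reproduced locally.
  interface Request {
    Iterable<String> head() throws IOException;
    InputStream body() throws IOException;
  }
  // A decorator in the spirit of RqMethod: it wraps any Request and
  // extracts the HTTP method from the first line of the head.
  static final class RqVerb implements Request {
    private final Request origin;
    RqVerb(final Request req) {
      this.origin = req;
    }
    @Override
    public Iterable<String> head() throws IOException {
      return this.origin.head();
    }
    @Override
    public InputStream body() throws IOException {
      return this.origin.body();
    }
    public String verb() throws IOException {
      return this.head().iterator().next().split(" ")[0];
    }
  }
  public static String demo() throws IOException {
    // A hard-coded sample request, standing in for a real socket.
    final Request req = new Request() {
      @Override
      public Iterable<String> head() {
        return Arrays.asList(
          "GET /user?id=123 HTTP/1.1",
          "Host: www.example.com"
        );
      }
      @Override
      public InputStream body() {
        return new ByteArrayInputStream(new byte[0]);
      }
    };
    return new RqVerb(req).verb();
  }
  public static void main(final String... args) throws IOException {
    System.out.println(demo());
  }
}
```

The decorator adds parsing behavior without touching the interface, which is why the framework can keep Request down to just two methods.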

First Real Web App

Let's create our first real web application, which will do something useful. I would recommend starting with an Entry class, which is required by Java to start an app from the command line:

import org.takes.http.Exit;
import org.takes.http.FtCli;
public final class Entry {
  public static void main(final String... args) throws Exception {
    new FtCli(new TkApp(), args).start(Exit.NEVER);
  }
}

This class contains just a single main() static method that will be called by the JVM when the app starts from the command line. As you see, it instantiates FtCli, giving it an instance of class TkApp and the command line arguments. We'll create the TkApp class in a second. FtCli (translates to "front-end with command line interface") makes an instance of the same FtBasic, wrapping it into a few useful decorators and configuring it according to command line arguments. For example, --port=8080 will be converted into port number 8080 and passed as the second argument of the FtBasic constructor.

The web application itself is called TkApp and extends TkWrap:

import org.takes.Take;
import org.takes.facets.fork.FkRegex;
import org.takes.facets.fork.TkFork;
import org.takes.tk.TkWrap;
import org.takes.tk.TkClasspath;
final class TkApp extends TkWrap {
  TkApp() {
    super(TkApp.make());
  }
  private static Take make() {
    return new TkFork(
      new FkRegex("/robots.txt", ""),
      new FkRegex("/css/.*", new TkClasspath()),
      new FkRegex("/", new TkIndex())
    );
  }
}

We'll discuss this TkFork class in a minute.

If you're using Maven, this is the pom.xml you should start with:

<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
    http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>foo</groupId>
  <artifactId>foo</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.takes</groupId>
      <artifactId>takes</artifactId>
      <version>0.9</version> <!-- check the latest in Maven Central -->
    </dependency>
  </dependencies>
  <build>
    <finalName>foo</finalName>
    <plugins>
      <plugin>
        <artifactId>maven-dependency-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>copy-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${project.build.directory}/deps</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Running mvn clean package should build a foo.jar file in the target directory and put a collection of all JAR dependencies into target/deps. Now you can run the app from the command line:

$ mvn clean package
$ java -Dfile.encoding=UTF-8 \
  -cp ./target/foo.jar:./target/deps/* foo.Entry --port=8080

The application is ready, and you can deploy it to, say, Heroku. Just create a Procfile file in the root of the repository and push the repo to Heroku. This is what Procfile should look like:

web: java -Dfile.encoding=UTF-8 \
  -cp target/foo.jar:target/deps/* \
  foo.Entry --port=${PORT}

TkFork

This TkFork class seems to be one of the core elements of the framework. It helps route an incoming HTTP request to the right take. Its logic is very simple, and there are just a few lines of code inside it. It encapsulates a collection of "forks," which are instances of the Fork interface:

public interface Fork {
  Iterator<Response> route(Request req) throws IOException;
}

Its only route() method either returns an empty iterator or an iterator with a single Response. TkFork goes through all forks, calling their route() methods until one of them returns a response. Once that happens, TkFork returns this response to the caller, which is FtBasic.
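In simplified form, that dispatch loop can be sketched like this (the Fork and Response types below are local stand-ins for this sketch, not the framework's real interfaces):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of TkFork-style dispatch; not the framework's
// actual source. Each fork either declines (empty iterator) or
// answers with a single response.
final class TkForkSketch {
  interface Response {
    String firstLine();
  }
  interface Fork {
    Iterator<Response> route(String path);
  }
  private final List<Fork> forks;
  TkForkSketch(final Fork... list) {
    this.forks = Arrays.asList(list);
  }
  // Ask each fork in turn; the first non-empty iterator wins.
  Response act(final String path) {
    for (final Fork fork : this.forks) {
      final Iterator<Response> responses = fork.route(path);
      if (responses.hasNext()) {
        return responses.next();
      }
    }
    // All forks declined: the real TkFork reports "page not found" here.
    throw new IllegalStateException("Page not found");
  }
}
```

With a single fork registered for /status, act("/status") returns that fork's response, while any other path ends in the "page not found" error.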

Let's create a simple fork ourselves now. For example, we want to show the status of the application when the /status URL is requested. Here is the code:

final class TkApp extends TkWrap {
  private static Take make() {
    return new TkFork(
      new Fork() {
        @Override
        public Iterator<Response> route(Request req) {
          final Collection<Response> responses = new ArrayList<>(1);
          if (new RqHref(req).href().path().equals("/status")) {
            responses.add(new TkStatus().act(req));
          }
          return responses.iterator();
        }
      }
    );
  }
}

I believe the logic here is clear. We either return an empty iterator or an iterator with a single response from TkStatus inside. If an empty iterator is returned, TkFork will try the next fork in the collection, looking for one that actually returns an instance of Response. By the way, if nothing is found and all forks return empty iterators, TkFork will throw a "Page not found" exception.

This exact logic is implemented by an out-of-the-box fork called FkRegex, which attempts to match a request URI path with the regular expression provided:

final class TkApp extends TkWrap {
  private static Take make() {
    return new TkFork(
      new FkRegex("/status", new TkStatus())
    );
  }
}

We can compose a multi-level structure of TkFork classes; for example:

final class TkApp extends TkWrap {
  private static Take make() {
    return new TkFork(
      new FkRegex(
        "/status",
        new TkFork(
          new FkParams("f", "json", new TkStatusJSON()),
          new FkParams("f", "xml", new TkStatusXML())
        )
      )
    );
  }
}

Again, I believe it's obvious. The instance of FkRegex will ask the encapsulated instance of TkFork to return a response, and that TkFork will try to fetch one from the takes encapsulated by the FkParams instances. If the HTTP query is /status?f=xml, an instance of TkStatusXML will be returned.

HTTP Response

Now let's discuss the structure of the HTTP response and its object-oriented abstraction, Response. This is how the interface looks:

public interface Response {
  Iterable<String> head() throws IOException;
  InputStream body() throws IOException;
}

Looks very similar to the Request, doesn't it? Well, it's identical, mostly because the structure of the HTTP request and response is almost identical. The only difference is the first line.
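For example, compare a request head with a response head; everything except the first line follows the same "name: value" format:

```text
GET /status HTTP/1.1      <- request: method, path, protocol
Host: www.example.com

HTTP/1.1 200 OK           <- response: protocol, status code, reason
Content-Type: text/html
```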

There is a collection of useful decorators that help in response building. They are composable, which makes them very convenient. For example, if you want to build a response that contains an HTML page, you compose them like this:

final class TkIndex implements Take {
  @Override
  public Response act(final Request req) {
    return new RsWithStatus(
      new RsWithType(
        new RsWithBody("<html>Hello, world!</html>"),
        "text/html"
      ),
      200
    );
  }
}

In this example, the decorator RsWithBody creates a response with a body but with no headers at all. Then, RsWithType adds the header Content-Type: text/html to it. Then, RsWithStatus makes sure the first line of the response contains HTTP/1.1 200 OK.
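Printed as plain text, the composed response would look roughly like this (the framework may add further headers, such as Content-Length):

```text
HTTP/1.1 200 OK
Content-Type: text/html

<html>Hello, world!</html>
```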

You can create your own decorators that can reuse existing ones. Take a look at how it's done in RsPage from rultor.com.

How About Templates?

Returning simple "Hello, world" pages is not a big problem, as we can see. But what about more complex output like HTML pages, XML documents, JSON data sets, etc.? There are a few convenient Response decorators that enable all of that. Let's start with Velocity, a simple templating engine. Well, it's not that simple; it's rather powerful, but I would suggest using it in simple situations only. Here is how it works:

final class TkIndex implements Take {
  @Override
  public Response act(final Request req) {
    return new RsVelocity("Hello, ${name}")
      .with("name", "Jeffrey");
  }
}

The RsVelocity constructor accepts a single argument that has to be a Velocity template. Then, you call the with() method, injecting data into the Velocity context. When it's time to render the HTTP response, RsVelocity will "evaluate" the template against the context configured. Again, I would recommend you use this templating approach only for simple outputs.

For more complex HTML documents, I would recommend you use XML/XSLT in combination with Xembly. I explained this idea in a few previous posts: XML+XSLT in a Browser and RESTful API and a Web Site in the Same URL. It is simple and powerful---Java generates XML output and the XSLT processor transforms it into HTML documents. This is how we separate representation from data. The XSL stylesheet is a "view" and TkIndex is a "controller," in terms of MVC.

I'll write a separate article about templating with Xembly and XSL very soon.

In the meantime, we'll create decorators for JSF/Facelets and JSP rendering in Takes. If you're interested in helping, please fork the framework and submit your pull requests.

What About Persistence?

Now, a question that comes up is what to do with persistent entities, like databases, in-memory structures, network connections, etc. My suggestion is to initialize them inside the Entry class and pass them as arguments into the TkApp constructor. Then, the TkApp will pass them into the constructors of custom takes.

For example, we have a PostgreSQL database that contains some table data that we need to render. Here is how I would initialize a connection to it in the Entry class (I'm using a BoneCP connection pool):

public final class Entry {
  public static void main(final String... args) throws Exception {
    new FtCli(new TkApp(Entry.postgres()), args).start(Exit.NEVER);
  }
  private static DataSource postgres() {
    final BoneCPDataSource src = new BoneCPDataSource();
    src.setDriverClass("org.postgresql.Driver");
    src.setJdbcUrl("jdbc:postgresql://localhost/db");
    src.setUser("root");
    src.setPassword("super-secret-password");
    return src;
  }
}

Now, the constructor of TkApp must accept a single argument of type javax.sql.DataSource:

final class TkApp extends TkWrap {
  TkApp(final DataSource source) {
    super(TkApp.make(source));
  }
  private static Take make(final DataSource source) {
    return new TkFork(
      new FkRegex("/", new TkIndex(source))
    );
  }
}

Class TkIndex also accepts a single argument of type DataSource. I believe you know what to do with it inside TkIndex in order to fetch the SQL table data and convert it into HTML. The point here is that the dependency must be injected into the application (an instance of TkApp) at the moment of its instantiation. This is a pure and clean dependency injection mechanism, which is absolutely container-free. Read more about it in Dependency Injection Containers Are Code Polluters.
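For illustration, here is a hedged sketch of what TkIndex could do with the injected data source; the table "employee" and its column "name" are invented for this example, and the HTML rendering is split into a static helper only to keep the sketch easy to unit-test. In the real take, the resulting string would then be wrapped into response decorators like RsWithBody and RsWithType:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedList;
import java.util.List;
import javax.sql.DataSource;

// Hypothetical sketch of TkIndex: fetches rows over plain JDBC
// and renders them as a simple HTML list.
final class TkIndex {
  private final DataSource source;
  TkIndex(final DataSource src) {
    this.source = src;
  }
  // Fetch the rows over plain JDBC and render them.
  public String act() throws SQLException {
    final List<String> names = new LinkedList<>();
    try (Connection conn = this.source.getConnection();
      Statement stmt = conn.createStatement();
      ResultSet rset = stmt.executeQuery("SELECT name FROM employee")) {
      while (rset.next()) {
        names.add(rset.getString("name"));
      }
    }
    return TkIndex.html(names);
  }
  // Pure rendering step, separated so it is trivial to test
  // without a database.
  static String html(final Iterable<String> names) {
    final StringBuilder page = new StringBuilder("<html><ul>");
    for (final String name : names) {
      page.append("<li>").append(name).append("</li>");
    }
    return page.append("</ul></html>").toString();
  }
}
```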

Unit Testing

Since every class is immutable and all dependencies are injected only through constructors, unit testing is extremely easy. Let's say we want to test TkStatus, which is supposed to return an HTML response (I'm using JUnit 4 and Hamcrest):

import org.junit.Test;
import org.hamcrest.MatcherAssert;
import org.hamcrest.Matchers;
public final class TkStatusTest {
  @Test
  public void returnsHtmlPage() throws Exception {
    MatcherAssert.assertThat(
      new RsPrint(
        new TkStatus().act(new RqFake())
      ).printBody(),
      Matchers.equalTo("<html>Hello, world!</html>")
    );
  }
}

Also, we can start the entire application or any individual take in a test HTTP server and test its behavior via a real TCP socket; for example (I'm using jcabi-http to make an HTTP request and check the output):

public final class TkIndexTest {
  @Test
  public void returnsHtmlPage() throws Exception {
    new FtRemote(new TkIndex()).exec(
      new FtRemote.Script() {
        @Override
        public void exec(final URI home) throws IOException {
          new JdkRequest(home)
            .fetch()
            .as(RestResponse.class)
            .assertStatus(HttpURLConnection.HTTP_OK)
            .assertBody(Matchers.containsString("Hello, world!"));
        }
      }
    );
  }
}

FtRemote starts a test web server at a random TCP port and calls the exec() method at the provided instance of FtRemote.Script. The first argument of this method is a URI of the just-started web server homepage.

The architecture of the Takes framework is very modular and composable. Any individual take can be tested as a standalone component, absolutely independently of the framework and other takes.

Why the Name?

That's the question I've been hearing rather often. The idea is simple, and it originates from the movie business. When a movie is made, the crew shoots many takes in order to capture the reality and put it on film. Each capture is called a take.

In other words, a take is like a snapshot of the reality.

The same applies to this framework. Each instance of Take represents a reality at one particular moment in time. This reality is then sent to the user in the form of a Response.

P.S. There are a few words about authentication in How Cookie-Based Authentication Works in the Takes Framework.

P.P.S. There are a few real web systems you may want to take a look at. They all use the Takes framework, and their code is open: rultor.com, jare.io, wring.io.

© Yegor Bugayenko 2014–2018

Worst Technical Specifications Have No Glossaries


I read a few technical specifications every week from our current and potential clients, and there's one thing I can't take anymore; I have to write about it: 99 percent of the documents I'm reading don't have glossaries, and because of that, they are very difficult to read and digest. Even when they do have glossaries, their definitions of terms are very vague and ambiguous. Why is this happening? Don't we understand the importance of a common vocabulary for any software project? I'm not sure what the causes are, but this is what a software architect should do when he or she starts a project---create a glossary.

Pulp Fiction (1994) by Quentin Tarantino

I'm trying to write something unique about this subject, but everything I can say is so obvious that I doubt anyone would be interested in reading it. Anyway, I will try.

A glossary (a.k.a. vocabulary) is a list of terms used by the project that is usually included at the beginning of the technical specification document. Ideally, every technical term used in the document should be briefly explained in the glossary. The existence of a glossary helps everyone who works with the document quickly understand each other and avoid misconceptions. On top of this, a detailed and accurate glossary saves a reader a lot of time.

So why are glossaries not written? I see a few possible causes (usually, they are combined):

We're Smarter Than This. Some people think glossaries are for newbies. After all, why would I explain what a PDU is? Any serious network engineer should understand that it stands for "protocol data unit." If you don't understand it, do your homework and then come back to work with us. Our team only works with well-educated engineers. You're supposed to understand what PDU, ADC, TxR, IPv6, DPI, FIFO, and USSR (joking!) stand for. Otherwise, you're not talented enough to be with us. Needless to say, this attitude can only come from those who have no idea what they are doing. A good engineer always remembers that if the receiver doesn't understand a message, it's the sender's fault.

We Don't Need These Formalities. Seriously, why would we spend time writing a glossary if everybody understands all our terms without it? We've been working as a team for a few years, so we all know what DPI and FIFO are, and we know what "record" and "timing data" are. Why bother with a glossary, which will provide no additional business value for us? I've seen many technical meetings of very mature and "well-organized" teams burn hours on pointless discussions simply because of different understandings of the same term. A glossary is not a formality; it's a key instrument of a software architect and all other team members.

We Prefer Working Software Over Comprehensive Documentation. This is what the Agile Manifesto says, and it's true. But the key word here is "comprehensive." We don't need comprehensive documentation, but we need a glossary. It's a key element in any project, and it simply can't be replaced by working software. No working software can help us understand what "header" and "data signal" are unless there is a simple and clear statement about it.

We Don't Have Time. We're developing too fast and brainstorming every day, so the concept is frequently changing. We simply don't have time to document our thoughts. We all understand each other, and that is the beauty of being agile. No, that is not a beauty. Instead, it is a lack of discipline and elementary management skill. A lack of a glossary is a personal fault of the software architect, and there are no excuses for it.

All Our Terms Are Well-Known to Everyone. Seriously, do we need to document what TCP/IP and FIFO are? That's what they teach us in school. Everyone understands that, don't they? Yes, some of the terms are well-known. But what is the problem with adding them to the glossary with a few words and a link to a Wikipedia article? This will only take a few minutes of an architect's time, but it will make life easier for everybody on the project, both now and a few years from now.

To conclude, there is no excuse for the absence of a glossary in any software project. And it is the personal responsibility of a software architect to keep this document (or a chapter) up to date.

Hope I wasn't too obvious above :)


Don't Create Objects That End With -ER


Manager. Controller. Helper. Handler. Writer. Reader. Converter. Validator. Router. Dispatcher. Observer. Listener. Sorter. Encoder. Decoder. This is the class names hall of shame. Have you seen them in your code? In open source libraries you're using? In pattern books? They are all wrong. What do they have in common? They all end in "-er." And what's wrong with that? They are not classes, and the objects they instantiate are not objects. Instead, they are collections of procedures pretending to be classes.

Fight Club (1999) by David Fincher

Peter Coad used to say: Challenge any class name that ends in "-er." There are a few good articles about this subject, including Your Coding Conventions Are Hurting You by Carlo Pescio, One of the Best Bits of Programming Advice I Ever Got by Travis Griggs, and Naming Objects – Don’t Use ER in Your Object Names by Ben Hall. The main argument against this "-er" suffix is that "when you need a manager, it's often a sign that the managed are just plain old data structures and that the manager is the smart procedure doing the real work."

I totally agree but would like to add a few words to this.

I mentioned already in Seven Virtues of a Good Object that a good object name is not a job title, but I didn't explain why I think so. Besides that, in Utility Classes Have Nothing to Do With Functional Programming, I tried to explain the difference between declarative and imperative programming paradigms. Now it's time to put these two pieces together.

Let's say I'm an object and you're my client. You give me a bucket of apples and ask me to sort them by size. If I'm living in the world of imperative programming, you will get them sorted immediately, and we will never interact again. I will do my job just as requested, without even thinking why you need them sorted. I would be a sorter who doesn't really care about your real intention:

List<Apple> sorted = new Sorter().sort(apples);
Apple biggest = sorted.get(0);

As you see here, the real intention is to find the biggest apple in the bucket.

This is not what you would expect from a good business partner who can help you work with a bucket of apples.

Instead, if I lived in the world of declarative programming, I would tell you: "Consider them sorted; what do you want to do next?" You, in turn, would tell me that you need the biggest apple now. And I would say, "No problem; here it is." In order to return the biggest one, I would not sort them all. I would just go through them one by one and select the biggest. This operation is much faster than sorting first and then selecting the first item in the list.

In other words, I would silently not follow your instructions but would try to do my business my way. I would be a much smarter partner of yours than that imperative sorter. And I would become a real object that behaves like a sorted list of apples instead of a procedure that sorts:

List<Apple> sorted = new Sorted(apples);
Apple biggest = sorted.get(0);

See the difference?

Pay special attention to the difference between the sorter and sorted names.
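To make the idea concrete, here is a hypothetical sketch of such a Sorted object; the apples are reduced to plain integer weights here, and the point is that no work happens until an element is actually requested:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: an object that behaves like a list of apples
// sorted by size (descending), where apples are reduced to integer
// weights. Nothing is computed until an element is requested.
final class Sorted {
  private final Collection<Integer> apples;
  Sorted(final Collection<Integer> apples) {
    this.apples = apples;
  }
  Integer get(final int index) {
    if (index == 0) {
      // The biggest apple: one linear scan, no full sort needed.
      return Collections.max(this.apples);
    }
    // Any other position: fall back to an actual descending sort.
    final List<Integer> copy = new ArrayList<>(this.apples);
    copy.sort(Collections.reverseOrder());
    return copy.get(index);
  }
}
```

Asking such an object for its first element costs a single pass over the bucket, while the imperative sorter would always pay the price of a full sort.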

Let's get back to class names. When you add the "-er" suffix to your class name, you're immediately turning it into a dumb imperative executor of your will. You do not allow it to think and improvise. You expect it to do exactly what you want---sort, manage, control, print, write, combine, concatenate, etc.

An object is a living organism that doesn't want to be told what to do. It wants to be an equal partner with other objects, exposing behavior according to its contract(s), a.k.a. interfaces in Java and C# or protocols in Swift.

Philosophically speaking, the "-er" suffix is a sign of disrespect toward the poor object.


Team Morale: Myths and Reality


There are plenty of books, articles, and blog posts about team morale. They will all suggest you do things like regular celebrations, team events, free lunches, pet-friendly offices, coffee machines, birthday presents, etc. All of these are instruments of concealed enslaving. These traditional techniques turn employees into speechless monkeys, programming under the influence of Prozac. Their existence and popularity is our big misfortune. Let me present my own vision of how team morale can be boosted on a software team---a team that has a good project manager.

Apocalypto (2006) by Mel Gibson

Fire Fast. The first and most important quality of a good manager is his or her ability to separate bad apples from good ones as soon as possible. Nothing will earn you more disrespect from your team than tolerance of under-performing team members. Your job as a manager is to help the best players play better, and they can't play better if they see that management doesn't understand the difference between excellence and mediocrity. It's a severe demotivating factor.

Be Honest About Problems and Risks. Your team is following you and expecting you to be a smart manager. While they are writing Java, you're talking to investors and customers. They want to be sure you know what you're doing. The best way to show them you have no idea where the team is going is to tell them that the future is bright and cloudless. Everybody understands that's either a lie and you are trying to hide risks or you're stupid enough to not see them. In either case, the best people would attempt to quit before it's too late. Thus, to keep morale up, regularly inform your people about problems you're facing and risks you're trying to prevent. They will appreciate it and respect you. A strong, professional manager deals with risks instead of ignoring them.

Failures Are Yours; Success Is Theirs. Always remember that when someone on your team makes a mistake, it is first of all your personal mistake. You hired that person, you trained him or her, you delegated the responsibility, and you controlled and monitored the job. Then he made a mistake, and the project lost money, disappointed a customer, or damaged the firm's reputation. Of course you need to take necessary disciplinary actions and maybe fire the troublemaker. But first of all, you have to admit in front of everyone that it was your personal mistake. You didn't control enough, you didn't plan well, or you didn't take preventive actions. This is what the team expects from you. Also, your people expect you to explain to them how you're going to learn from this mistake in order to prevent a similar one from happening in the future. A strong manager isn't afraid to look stupid in front of the team. A weak manager does look stupid when he or she tries to hide mistakes that have been made.

Responsibility Is Always Personal. The most demotivating word used in task descriptions is "together." Don't use it. Each task has to be personally and individually assigned (no matter what the Agile Manifesto says). Everybody is responsible for his or her own success or failure. How their results join together and lead to a mutual success or failure---that's your business. Whether you succeed or fail, we all will see. Once you say we all have to succeed together, the team understands that you're trying to shift responsibility from your own shoulders to theirs. It's a sign of weakness, and you lose respect. Make tasks and goals strictly personal, and be prepared to be responsible for the group's success. You, as a manager, break down an entire project into parts and delegate them to your people. If you do this job properly, we all will succeed. But don't try to blame us if the parts fall apart.

Don't Mention Steve Jobs. Try to avoid global slogans and world domination speeches in the office and in front of the team. They demotivate. If we're doing so good, why are our salaries not reflecting this success yet? If your vision is so global, why is it not yet implemented in reality? Don't promise to become the next Steve Jobs. Instead, become the next good manager of a highly paid team that is solving interesting problems for real people. Your practical achievements, no matter how small and down-to-earth they are, will give you much more respect than many-hour-long speeches about our fantastic future.

Don't Say a Word About Agile. Even though Agile is a great attitude-changing and mind-shifting concept, it is absolutely inapplicable in practice, mostly because it is too abstract. When you're proclaiming in the office that we should value "working software over comprehensive documentation," it sounds like you don't know what you're doing. The team doesn't need such abstract slogans from you. It needs specific instructions and rules in order to follow them and produce results, money, and satisfaction. Agile is a set of abstract principles that you should understand and digest. But then, after you chew them properly, convert them to specific and very unambiguous rules of work. Don't talk about Agile; be agile.

Don't Close the Door. Responsibility is personal, money is personal, and results are personal. But their discussions should be open to everybody. Don't close the door to that meeting room when you're talking about problems or appraising someone's results. You want your team to work together? Give everybody an assurance that none of them will be terminated behind a closed door. These pompous speeches about "us working together" usually turn into mush once the team sees that someone gets fired after a private conversation with a manager. Are we together, or is it you against us? To keep team morale up, you, as a manager, have to establish ground rules of work that will define who gets what when we succeed and who goes home first when we fail. These rules should be open to everybody. These rules should rule the team, not your personal decisions made behind a closed door.

Celebrate Achievements Instead of Birthdays. Team-building events are a great tool to boost team morale, but only when they are built around personal or team achievements instead of calendar events. A project team is not a group of friends or family members, even though some teams may feel like that. No matter how it feels, a team is here for one reason---to create the product and make money for its sponsor. This is the direction we're going. Our goal is not to build a community and live together til the end of our days. Our goal is to achieve the business success of the product we're developing, or in other words, complete the project. When the only events we're celebrating are our birthdays, that's a sign to us that our managers are trying to lie to us. They are pretending that we're here to make a community of friends while in reality they are using us to build their business. It's unhealthy and ruins team morale. Instead, celebrate achievements on your real path---to the success of the product under development. This will show everybody that you, as a manager, are honest with your people and ready to show them that their true role on the team is to develop a product and earn money for its investors. Honesty is the best team morale booster.

Don't Rule; Make Rules and Plans. Nothing demotivates more than an unpredictable manager. For the team, you are an abstraction of the entire world around the team. They see the reality through the prism of your personality. What you tell them about the reality is what they perceive. If you are unpredictable, the reality is unpredictable and scary for them. To avoid that, stop making decisions that are based on your personal and momentary judgment. Instead, make decisions that are based on the rules you've defined upfront and plans you've drawn beforehand. First, create a plan for team growth and announce it to everybody. The plan should include risks and their mitigation actions. The plan should say who will be fired first when or if the project goes down. The plan should give a predictable and measurable picture of the reality around your office. It should be a map of terrain you're going to cross with your team. When it's time to make a decision, everybody will understand why it's made and will respect you as a manager who predicted the situation and managed it professionally.

Put Money on the Table. Discuss money openly and freely, right in the office, right in front of everybody. This advice is for true professionals. If you can't do what is said above, don't even try this one. But if you consider yourself a real pro in management and leadership, you should put money on the table and let everybody know who is getting what, when, why, and why not. Everybody should know everybody's salaries, bonuses, benefits, and the rationale behind them. Each programmer should know what he or she should do in order to get a $5,000 raise to their annual salary. Also, he or she should know why a colleague is called "senior developer" while his or her title is still "junior." This information should be public and printed on the wall right behind your chair. Why don't most managers do this? Because they don't have any rationale behind their monetary decisions. Instead of managing the money, they let money manage them.


If you like this article, you will definitely like these very relevant posts too:

Command, Control, and Innovate
Command and control works in military organizations, but it apparently doesn't work in Silicon Valley; why is that?

How to Hire a Programmer
Finding and hiring a good programmer is a tricky task, and most entrepreneurs don't know how to do it right.

Competition Without Rules Is Destructive
When your team doesn't have rules of competition, you get competition without rules, which ruins team motivation.


Composable Decorators vs. Imperative Utility Methods


The decorator pattern is my favorite of all the patterns I'm aware of. It is a very simple and yet very powerful mechanism for making your code highly cohesive and loosely coupled. However, I believe decorators are not used often enough. They should be everywhere, but they are not. The biggest advantage we get from decorators is that they make our code composable. That's why the title of this post is "composable decorators." Unfortunately, instead of decorators, we often use imperative utility methods, which make our code procedural rather than object-oriented.


First, a practical example. Here is an interface for an object that is supposed to read a text somewhere and return it:

interface Text {
  String read();
}

Here is an implementation that reads the text from a file:

final class TextInFile implements Text {
  private final File file;
  public TextInFile(final File src) {
    this.file = src;
  }
  @Override
  public String read() {
    try {
      return new String(
        Files.readAllBytes(this.file.toPath()),
        StandardCharsets.UTF_8
      );
    } catch (final IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }
}

And now the decorator, which is another implementation of Text that removes all unprintable characters from the text:

final class PrintableText implements Text {
  private final Text origin;
  public PrintableText(final Text text) {
    this.origin = text;
  }
  @Override
  public String read() {
    return this.origin.read()
      .replaceAll("[^\\p{Print}]", "");
  }
}

Here is how I'm using it:

final Text text = new PrintableText(
  new TextInFile(new File("/tmp/a.txt"))
);
String content = text.read();

As you can see, the PrintableText doesn't read the text from the file. It doesn't really care where the text is coming from. It delegates text reading to the encapsulated instance of Text. How this encapsulated object will deal with the text and where it will get it doesn't concern PrintableText.

Let's continue and try to create an implementation of Text that will capitalize all letters in the text:

final class AllCapsText implements Text {
  private final Text origin;
  public AllCapsText(final Text text) {
    this.origin = text;
  }
  @Override
  public String read() {
    return this.origin.read().toUpperCase(Locale.ENGLISH);
  }
}

How about a Text that trims the input:

final class TrimmedText implements Text {
  private final Text origin;
  public TrimmedText(final Text text) {
    this.origin = text;
  }
  @Override
  public String read() {
    return this.origin.read().trim();
  }
}

I can go on and on with these decorators. I can create many of them, suitable for their own individual use cases. But let's see how they all can play together. Let's say I want to read the text from the file, capitalize it, trim it, and remove all unprintable characters. And I want to be declarative. Here is what I do:

final Text text = new AllCapsText(
  new TrimmedText(
    new PrintableText(
      new TextInFile(new File("/tmp/a.txt"))
    )
  )
);
String content = text.read();

First, I create an instance of Text, composing multiple decorators into a single object. I declaratively define the behavior of text without actually executing anything. Until method read() is called, the file is not touched and the processing of the text is not started. The object text is just a composition of decorators, not an executable procedure. Check out this article about declarative and imperative styles of programming: Utility Classes Have Nothing to Do With Functional Programming.

This design is much more flexible and reusable than a more traditional one, where the Text object is smart enough to perform all said operations. For example, class String from Java is a good example of bad design. It has more than 20 utility methods that should have been provided as decorators instead: trim(), toUpperCase(), substring(), split(), and many others. When I want to trim my string, uppercase it, and then split it into pieces, here is what my code will look like:

final String txt = "hello, world!";
final String[] parts = txt.trim().toUpperCase().split(" ");

This is imperative and procedural programming. Composable decorators, on the other hand, would make this code object-oriented and declarative. Something like this would be great to have in Java instead (pseudo-code):

final String[] parts = new String.Split(
  new String.UpperCased(
    new String.Trimmed("hello, world!")
  )
);
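Such composable decorators can be sketched in plain Java today; the names Str, Plain, Trimmed, UpperCased, and Split below are all hypothetical, invented to mirror the pseudo-code:

```java
import java.util.Locale;

// Str plays the role of String in the pseudo-code above.
interface Str {
  String value();
}

final class Plain implements Str {
  private final String text;
  Plain(final String text) {
    this.text = text;
  }
  @Override
  public String value() {
    return this.text;
  }
}

final class Trimmed implements Str {
  private final Str origin;
  Trimmed(final Str origin) {
    this.origin = origin;
  }
  @Override
  public String value() {
    return this.origin.value().trim();
  }
}

final class UpperCased implements Str {
  private final Str origin;
  UpperCased(final Str origin) {
    this.origin = origin;
  }
  @Override
  public String value() {
    return this.origin.value().toUpperCase(Locale.ENGLISH);
  }
}

// Split terminates the chain, turning the decorated text into parts.
final class Split {
  private final Str origin;
  Split(final Str origin) {
    this.origin = origin;
  }
  public String[] parts() {
    return this.origin.value().split(" ");
  }
}
```

With these classes, `new Split(new UpperCased(new Trimmed(new Plain(" hello, world! ")))).parts()` yields the two pieces "HELLO," and "WORLD!", and no trimming or uppercasing happens until parts() is called.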

To conclude, I recommend you think twice every time you add a new utility method to the interface/class. Try to avoid utility methods as much as possible, and use decorators instead. An ideal interface should contain only methods that you absolutely cannot remove. Everything else should be done through composable decorators.

© Yegor Bugayenko 2014–2018

A Haircut


I received a haircut today, and the niceness of my hairdresser led him to fill the appointment with courteous questions about how I wanted my hair cut, what size of clipper he should use, how long the sides should be, and how much should be removed from the front. He also offered me many types of shampoo and a cup of tea. All this reminded me of the work we do as programmers, and I decided to write a short post about it. I've already mentioned before that trying to make a customer happy is a false objective. This hairdresser was a perfect illustrative example of this very mistake. By the way, in the end, I wasn't happy, and he got no tip. How could this happen if he was so friendly and nice?

The Man Who Wasn't There (2001) by Coen Brothers

I'm not a hairdresser, and I have very little understanding of how to deal with hair. I came to him because I assumed he knew more about this business than I did. I chose him through the assistance of Yelp. I wanted him to tell me how long the hair on the sides should be and how much should be removed on the top. I expected him to give me his professional judgment and stand by it.

Instead of asking me how much I wanted removed on the sides, he should have told me there should be less on the sides. This is what a true professional would do. A true professional would give me his vision of the haircut that best suits me and would try to convince me that it was the best choice.

A true professional would not ask me but would tell me instead, because he would understand that my goal was not to boss him around and make him do my hair the way I wanted it. My goal was to get the best out of his professional skill.

Unfortunately, the guy was either weak or immature. He didn't argue with me and didn't try to convince me. He tried to please me. In the end, he lost.

Exactly the same thing happens when we ask our customers about the technologies they want us to use. I hear this question very often: What language do you want us to use (meaning Java or Ruby or something else)? Or what database should we use? Or how do you want us to design this?

Don't do that. Don't lose like that hairdresser. Don't ask your clients what they want. Instead, learn their business requirements and then suggest the solution you think is the best for them. Then, insist and argue if they don't agree. Convince them. Even if they fire you in the end for your stubbornness, it's better than being that hairdresser who is doomed to please every single client without getting anywhere further.

Remember, the client is not the king; his hair is.


Utility Classes Have Nothing to Do With Functional Programming


I was recently accused of being against functional programming because I call utility classes an anti-pattern. That's absolutely wrong! Well, I do consider them a terrible anti-pattern, but they have nothing to do with functional programming. I believe there are two basic reasons why. First, functional programming is declarative, while utility class methods are imperative. Second, functional programming is based on lambda calculus, where a function can be assigned to a variable. Utility class methods are not functions in this sense. I'll decode these statements in a minute.

In Java, there are basically two valid alternatives to these ugly utility classes aggressively promoted by Guava, Apache Commons, and others. The first one is the use of traditional classes, and the second one is Java 8 lambdas. Now let's see why utility classes are not even close to functional programming and where this misconception is coming from.

Color Me Kubrick (2005) by Brian W. Cook

Here is a typical example of a utility class Math from Java 1.0:

public class Math {
  public static double abs(double a);
  // a few dozens of other methods of the same style
}

Here is how you would use it when you want to calculate an absolute value of a floating point number:

double x = Math.abs(3.1415926d);

What's wrong with it? We need a function, and we get it from class Math. The class has many useful functions inside it that can be used for many typical mathematical operations, like calculating the maximum, minimum, sine, cosine, etc. It is a very popular concept; just look at any commercial or open source product. These utility classes have been used everywhere since Java was invented (this Math class was introduced in Java's first version). Well, technically there is nothing wrong. The code will work. But it is not object-oriented programming. Instead, it is imperative and procedural. Do we care? Well, it's up to you to decide. Let's see what the difference is.

There are basically two different approaches: declarative and imperative.

Imperative programming is focused on describing how a program operates in terms of statements that change a program state. We just saw an example of imperative programming above. Here is another (this is pure imperative/procedural programming that has nothing to do with OOP):

public class MyMath {
  public double f(double a, double b) {
    double max = Math.max(a, b);
    double x = Math.abs(max);
    return x;
  }
}

Declarative programming focuses on what the program should accomplish without prescribing how to do it in terms of sequences of actions to be taken. This is how the same code would look in Lisp, a functional programming language:

(defun f (a b) (abs (max a b)))

What's the catch? Just a difference in syntax? Not really.

There are many definitions of the difference between imperative and declarative styles, but I will try to give my own. There are basically three roles interacting in the scenario with this f function/method: a buyer, a packager of the result, and a consumer of the result. Let's say I call this function like this:

public void foo() {
  double x = this.calc(5, -7);
  System.out.println("max+abs equals to " + x);
}
private double calc(double a, double b) {
  double x = new MyMath().f(a, b);
  return x;
}

Here, method calc() is a buyer, method MyMath.f() is a packager of the result, and method foo() is a consumer. No matter which programming style is used, there are always these three guys participating in the process: the buyer, the packager, and the consumer.

Imagine you're a buyer and want to purchase a gift for your (girl|boy)friend. The first option is to visit a shop, pay $50, let them package that perfume for you, and then deliver it to the friend (and get a kiss in return). This is an imperative style.

The second option is to visit a shop, pay $50, and get a gift card. You then present this card to the friend (and get a kiss in return). When he or she decides to convert it to perfume, he or she will visit the shop and get it. This is a declarative style.

See the difference?

In the first case, which is imperative, you force the packager (a beauty shop) to find that perfume in stock, package it, and present it to you as a ready-to-be-used product. In the second scenario, which is declarative, you're just getting a promise from the shop that eventually, when it's necessary, the staff will find the perfume in stock, package it, and provide it to those who need it. If your friend never visits the shop with that gift card, the perfume will remain in stock.

Moreover, your friend can use that gift card as a product itself, never visiting the shop. He or she may instead present it to somebody else as a gift or just exchange it for another card or product. The gift card itself becomes a product!

So the difference is what the consumer is getting---either a product ready to be used (imperative) or a voucher for the product, which can later be converted into a real product (declarative).

Utility classes, like Math from JDK or StringUtils from Apache Commons, return products ready to be used immediately, while functions in Lisp and other functional languages return "vouchers." For example, if you call the max function in Lisp, the actual maximum between two numbers will only be calculated when you actually start using it:

(let ((x (max 1 5)))
  (print "X equals to " x))

Until this print actually starts to output characters to the screen, the function max won't be called. This x is a "voucher" returned to you when you attempted to "buy" a maximum between 1 and 5.
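The same "voucher" behavior can be sketched in Java 8 with a Supplier (this is an illustration I'm adding, not something Math.max does by itself):

```java
import java.util.function.Supplier;

class Voucher {
  public static void main(final String[] args) {
    // The "voucher": a promise to compute the maximum, not the maximum itself.
    final Supplier<Integer> x = () -> Math.max(1, 5);
    // Nothing is computed yet; x can be stored or passed around as a value.
    System.out.println("X equals to " + x.get()); // computed only here
  }
}
```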

Note, however, that nesting Java static functions one into another doesn't make them declarative. The code is still imperative, because its execution delivers the result here and now:

public class MyMath {
  public double f(double a, double b) {
    return Math.abs(Math.max(a, b));
  }
}

"Okay," you may say, "I got it, but why is declarative style better than imperative? What's the big deal?" I'm getting to it. Let me first show the difference between functions in functional programming and static methods in OOP. As mentioned above, this is the second big difference between utility classes and functional programming.

In any functional programming language, you can do this:

(defun foo (x) (x 5))

Then, later, you can call that x:

(defun bar (x) (+ x 1)) ;; defining function bar
(print (foo bar)) ;; passing bar as an argument to foo

Static methods in Java are not functions in terms of functional programming. You can't do anything like this with a static method. You can't pass a static method as an argument to another method. Basically, static methods are procedures or, simply put, Java statements grouped under a unique name. The only way to access them is to call a procedure and pass all necessary arguments to it. The procedure will calculate something and return a result that is immediately ready for usage.
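For contrast, Java 8 lambdas and method references do let you pass behavior around as a value; here is a sketch of the Lisp example above (foo and bar are the same hypothetical names):

```java
import java.util.function.DoubleUnaryOperator;

class Lambdas {
  // foo receives a function and applies it to 5, as in the Lisp snippet.
  static double foo(final DoubleUnaryOperator x) {
    return x.applyAsDouble(5.0d);
  }
  public static void main(final String[] args) {
    final DoubleUnaryOperator bar = a -> a + 1.0d; // defining function bar
    System.out.println(foo(bar)); // prints 6.0
    // Even a static method becomes a value when wrapped in a method reference:
    System.out.println(foo(Math::abs)); // prints 5.0
  }
}
```

The difference is precisely that a bare static method cannot travel anywhere by itself; it first has to be wrapped into an object (here, a DoubleUnaryOperator).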

And now we're getting to the final question I can hear you asking: "Okay, utility classes are not functional programming, but they look like functional programming, they work very fast, and they are very easy to use. Why not use them? Why aim for perfection when 20 years of Java history proves that utility classes are the main instrument of each Java developer?"

Besides OOP fundamentalism, which I'm very often accused of, there are a few very practical reasons (BTW, I am an OOP fundamentalist):

  • Testability. Calls to static methods in utility classes are hard-coded dependencies that can never be broken for testing purposes. If your class is calling FileUtils.readFile(), I will never be able to test it without using a real file on disk.

  • Efficiency. Utility classes, due to their imperative nature, are much less efficient than their declarative alternatives. They simply do all calculations right here and now, taking processor resources even when it's not yet necessary. Instead of returning a promise to break down a string into chunks, StringUtils.split() breaks it down right now. And it breaks it down into all possible chunks, even if only the first one is required by the "buyer."

  • Readability. Utility classes tend to be huge (try to read the source code of StringUtils or FileUtils from Apache Commons). The entire idea of separation of concerns, which makes OOP so beautiful, is absent in utility classes. They just put all possible procedures into one huge .java file, which becomes absolutely unmaintainable when it surpasses a dozen static methods.
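The testability point can be illustrated with a sketch (the Source and Report names are mine, invented for this example): when a class receives an object instead of calling a static utility, a unit test can hand it a fake and never touch the disk:

```java
// A source of content; in production it could read a file,
// in a unit test it can be a one-line fake.
interface Source {
  String content();
}

final class Report {
  private final Source source;
  Report(final Source source) {
    this.source = source;
  }
  String print() {
    return "Report: " + this.source.content();
  }
}

class Demo {
  public static void main(final String[] args) {
    // No file system involved; the dependency is replaceable.
    final Report report = new Report(() -> "42 items");
    System.out.println(report.print()); // prints "Report: 42 items"
  }
}
```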

To conclude, let me reiterate: Utility classes have nothing to do with functional programming. They are simply bags of static methods, which are imperative procedures. Try to stay as far as possible away from them and use solid, cohesive objects no matter how many of them you have to declare and how small they are.


It's Not a School!


At Zerocracy, we work in distributed teams and keep all our communications in tickets. Besides that, we encourage every developer on every project to report bugs whenever he or she finds them. We even pay for each bug found. Once in a while, I see bugs reported along these lines: "Can someone explain to me how to design this module?" or even "I haven't used this library before; please help me get started." My usual answer is, "This is not a school; nobody is going to teach you here!" I realize this sounds rather harsh to most developers who are just starting to work with us, so here I'll try to illustrate why such an attitude makes sense and is beneficial to both the programmers and the project.

Disclaimer: I'm talking about software projects here, which PMBOK defines as "temporary endeavors undertaken to create unique products, services, or results." If your team is engaged in continuous development or maintenance of software, this concept may not be relevant.

G.I. Jane (1997) by Ridley Scott

No matter how unpleasant this could be, let's face the reality: each software project is a business, and we, the developers, are its resources. Just like concrete, wood, and glass are the resources required to build a house, which is also a business activity. No matter how much we think about ourselves as a family having fun together and writing code because we enjoy it, each business looks at it completely differently.

The project needs us to produce classes, lines of code, methods, functions, files, and features. Then, the project can convert them into happy customers, which will give us something back---usually cash. Finally, the project will share that cash among us, investors, and the government.

A properly planned and managed project acquires the best resources its budget can afford and then relies on their quality. A programmer who doesn't have adequate skills or knowledge is an unreliable resource. Obviously, no project would acquire such a resource from the start. However, this weakness may be revealed in the middle of the project.

Say you're building a house and you contracted a plumber to install a drainage system. When it's time to mount the equipment, he tells you that he's never worked with it and doesn't know how to install it. It was a risk, and it occurred. A good project manager always has a fallback plan or even a few of them. Obviously, the best option would be to contract another plumber. The worst option would be to train the original one on the spot.

Wait, why is that so obvious? The plumber is a great guy. Yes, he doesn't know how to work with this equipment, but that doesn't mean we should fire him immediately. Let's pay for his training, send him to some courses, buy him some books, let him experiment with the equipment for some time, and then he will be able to install it in our house. Great plan, isn't it? The plumber will be happy.

But the project won't.

The goal of the project is to build a house, not to train a plumber. The project doesn't even have funds to train the bloody plumber! If we train and teach all our workers, we won't ever build a house. We're not running a school here. We're building a house!

While working on a software project, a good project manager has a staffing management plan that describes how skills will be recruited, tested, applied, and discharged when necessary. Such a plan may include training, but it would be as small an amount as possible---mostly because a plumber trained by us costs much more than one trained by someone else but does exactly the same, or worse, work. Thus, a smart project manager acquires project members who are already capable of performing their duties and falls back on training only in exceptional situations.

Now, a logical question: What should we, as programmers, do? We want to learn, and we don't want to spend our own money on it. We don't want to sit home for a few years learning all possible technologies before entering the job market as experts, ready to be hired. We want to learn on the job. We basically want to use project budgets for our own personal educational needs. Moreover, a smart programmer exits every project with some new knowledge, new skills, and new technologies in his or her portfolio.

However, if you make your projects spend their money on your education, that's theft. And a good project manager should stop you, saying "This is not a school!"

What is the solution?

I believe that in the software business, there is only one workaround---blame the project for your own deficiencies in education, skills, or information. I'm being absolutely serious. Let's discuss a few practical situations.

Say you have a module to work with, and it has to be written in Python. You have no experience in Python; you're a Java developer. Who is at fault here? You could think of it as your problem and ask your project manager to teach you, but he should tell you he's not running a school and get rid of you. That's a bad scenario for both of you. Instead, blame the project manager. He hired you. He put you into this situation. He planned all project activities, so he probably knows what he is doing. This means that the project documentation should be detailed enough for a Java developer to create that Python module. However, it is not detailed enough. So report this issue and wait for its resolution. Explain in your bug report that you honestly started to work with the module and realized that its documentation is not complete enough for a Java developer to understand. Ask the project manager to fix this. If the project decides to invest its money into the documentation, you have the chance to read it and learn. Thus, the project's money spent on your education will also contribute to the project. It's a win-win.

Here is another example: Say you have to fix a Java module and you're a Java developer. You understand Java, but you don't understand how this algorithm works. You could blame yourself for not reading Knuth in school and ask the project to train you on it. A good and strong project manager should tell you that it's not a school and get rid of you. Again, a bad scenario for both of you. Instead, blame the project. The code is not self-descriptive and is difficult to understand. The algorithm implementation is not obvious and is poorly documented. Ask for better documentation. If the project invests its money into the documentation, you will learn the algorithm from it. The source code will be improved, and you will have improved your skills. Win-win.

One more example: Say you are tasked to implement a WebSockets back-end in an existing web app. You know how WebSockets work but can't understand how to connect this new back-end to the existing persistence layer. You are rather new to the project and don't understand what would be the right design. You could ask for the project to provide training to explain how the code works and how it can be extended with features like this one. A project manager should tell you that you're not in school and are supposed to understand the software if the project is paying you a software developer salary. And he will be right. But it's a bad scenario for both of you. Instead, blame the project for incomplete design documentation. Good software should properly document its architecture and design. Ask for the project to provide such documentation. If it invests its time and money into better documentation, you will learn from it and find all the necessary answers. Another win-win.

There are a few other examples in my How to Cut Corners and Stay Cool post.

In conclusion, I would recommend you remember that software projects are, first and foremost, business activities where we, the developers, are resources. In order to obtain something for ourselves in terms of education and training, we should align our goals with project objectives. Instead of asking for help and information, we should blame the project for its lack of documentation. By fixing this flaw, the project will improve its artifacts and, at the same time, provide valuable knowledge to us, its participants.


Code For the User, Not for Yourself


First, no matter what the methodology is, we all write software for our users (a.k.a. customers, project sponsors, end users, or clients). Second, no matter what the methodology is, we write incrementally, releasing features and bug fixes one by one. Maybe I'm saying something absolutely obvious here, but it's important to remember that each new version should first of all satisfy the needs of the user, not of us programmers. In other words, the way we decompose a big task into smaller pieces should be user-targeted, and that's why you always work top down. Let's see what I mean through a practical example.

Delicatessen (1991) by Jean-Pierre Jeunet

Say I'm contracted by a friend of mine to create a word-counting command line tool very similar to wc. He promised to pay me $200 for this work, and I promised him I'd deliver the product in two increments---an alpha and beta version. I promised him I'd release the alpha version on Saturday and the beta version on Sunday. He is going to pay me $100 after the first release and the rest after the second release.

I'll write in C, and he will pay in cash.

The tool is very primitive, and it only took me a few minutes to write. Take a look at it:

#include <stdio.h>
#include <unistd.h>
int main() {
  char ch;
  int count = 0;
  while (1) {
    if (read(STDIN_FILENO, &ch, 1) <= 0) {
      break;
    }
    if (ch == ' ') {
      ++count;
    }
  }
  if (count > 0) {
    ++count;
  }
  printf("%d\n", count);
  return 0;
}

But let's be professional and not forget about build automation and unit testing. Here is a simple Makefile that does them both:

all: wc test
wc: wc.c
  gcc -o wc wc.c
test: wc
  echo '' | ./wc | grep '0'
  echo 'Hello, world! How are you?' | ./wc | grep '5'

Now I run make from a command line and get this output:

$ make
echo '' | ./wc | grep '0'
0
echo 'Hello, world! How are you?' | ./wc | grep '5'
5

All clean!

I'm ready to get my $200. Wait, the deal was to deliver two versions and get cash in two installments. Let's back up a little and think---how can we break this small tool into two parts?

On first thought, let's release the tool itself first and build automation and testing next. Is that a good idea? Can we deliver any software without running it first with a test? How can I be sure that it works if I don't ship tests together with it? What will my friend think about me releasing anything without tests? This would be a total embarrassment.

Okay, let's release Makefile first and wc.c next. But what will my friend do with a couple of tests and no product in hand? This first release will be absolutely pointless, and I won't get my $100.

Now we're getting to the point of this article. What I'm trying to say is that every new increment must add some value to the product as it is perceived by the customer, not by us programmers. The Makefile is definitely a valuable artifact, but it provides no value to my friend. He doesn't need it, but I need it.

Here is what I'm going to do. I'll release a skeleton of the tool, backed by the tests but with an absolutely dummy implementation. Look at it:

#include <stdio.h>
int main() {
  printf("5\n");
  return 0;
}

And I will modify the Makefile accordingly. I will disable the first test to make sure the build passes.

Does my tool work? Yes, it does. Does it count words? Yes, it does, for some inputs. Does it have value to my friend? Obviously! He can run it from the command line, and he can pass a file as an input. He will always get the number "5" as a result of counting, though. That's a bummer, but it's an alpha version. He doesn't expect it to work perfectly.

However, it works, it is backed by tests, and it is properly packaged.

What I just did is a top-down approach to design. First of all, I created something that provides value to my customer. I made sure it also satisfies my technical objectives, like proper unit test coverage and build automation. But the most important goal for me was to make sure my friend received something ... and paid me.


Four NOs of a Serious Code Reviewer


Code reviews (a.k.a. peer reviews) must be a mandatory practice for every serious software development team. I hope there is no debate about this. Some do pre-merge code reviews, protecting their master/development branch from accidental mistakes. Others do post-merge regular reviews to discover bugs and inconsistencies after they are introduced by their authors. Some even do both, reviewing before merges and regularly after. Code reviews are very similar to a white-box testing technique where a tester looks for defects with full access to the sources of the software. In either case, a code review is a great instrument to increase quality and boost team motivation.

However, it's not so simple to do them right. I would even say it's very easy and comfortable to do them wrong. Most code reviews and reviewers I've seen make similar mistakes. That's why I decided to summarize the four basic principles of a good reviewer as I see them. Hopefully you find them helpful.


No Fear

There are a few different fears a serious code reviewer should renounce. The first and most popular is the fear of offending the author of the code. "I'd better close my eyes and pretend I didn't see her bugs today so tomorrow she will ignore my mistakes." This is the kind of attitude this fear produces. Needless to say, it's counterproductive and degrades code quality and team morale. Here is my advice: be direct, honest, and straightforward. If you don't like the code, you don't like it. You shouldn't care how your opinion will be taken by the author of the code.

If you do have such feelings toward your colleagues, there is something wrong with the management model. You're afraid of being rejected by the team for "not being a team player," which is a label attached to you by the weakest members of the team, not by the project sponsor. The sponsor pays you to produce high-quality software. The sponsor doesn't care how much your intention to increase quality offends others, who care less. The sponsor wants his money to produce deliverables that can be sold to customers and returned back in profit. The sponsor is not paying you to make friends in the office.

The next type of fear sounds like this: "If I reject this code, I will delay the release." Again, it goes without saying that such an attitude does a significant disservice to the entire project. You will accept the code and close your eyes to what you don't like in it. The code will go into the next release, and we'll ship it sooner. You won't be a bottleneck, and nobody will say that because of that dogmatic code reviewer, we delayed the release and lost some cash. You will be a good team player, right? Wrong!

As I've mentioned before, a professional team player understands his or her personal role in a project and doesn't cover anyone's ass. If the rejection of bad code delays the release, that's the fault of its author. Your professional responsibility is to make this fault visible. That's how you help the team learn and improve.

I think it's obvious that the education and improvement of a team first requires getting rid of bad programmers and promoting good ones. Honest and fearless code reviewers help the team learn and improve.

Yet another fear is expressed like this: "I may be wrong, and they will laugh at me." Even worse, they may spot my lack of knowledge. They may see that I don't know what I'm doing. It would be better to stay quiet and pretend there are no bugs in the code. At least then I wouldn't embarrass myself with stupid comments. You know that it's much easier to look smart if you keep your mouth shut, right? Wrong!

The project is not paying you to look good. You're getting your paychecks not because the team loves you but because you produce deliverables on a daily basis. Your professional responsibility is to do what's best for the project and ignore everyone's opinions, including the opinion of your boss. Similar to A Happy Boss Is a False Objective, I would say that the respect of the team is a false goal. Don't try to earn respect. Instead, create clean code and respect will come automatically.

Let me reiterate: Don't be afraid to embarrass yourself by making incorrect and stupid comments about someone's code. Be loyal to the project, not to the expectations of people around you. They expect you to be smart and bright, but the project expects you to say what you think about the code. So screw their opinions; do the right thing and say what you really think.

No Compromise

Okay, you've fearlessly said what you thought about the code and simply rejected it. The branch you were reviewing is not good, and you explained why. You asked its author to refactor here and re-write there. What's next?

He or she will try to make a deal with you. It's natural and it's happening in almost every branch I'm seeing in our teams. The author of the code is also a professional developer, and he also has no fear. So he insists that his implementation approach is right and your ideas are wrong. What should a professional code reviewer do in this case?

The worst thing (as in any conflict resolution) is a compromise. This is what ruins quality faster than bad code. A compromise is a conflict resolution technique in which both parties agree not to get what they wanted, just for the sake of ceasing the conflict. In other words, "Let's make peace just to stop fighting." It's the worst approach ever.

Instead of a lousy compromise, there are three professional exits from a fight over a piece of code:

  • "You're right; I take my comments back!" This may happen, and it should happen very often. You should be ready to admit your mistakes. You didn't like the code, but its author explained to you its benefits, and you accepted the logic---not because you want to stop fighting with him but because you really understood the logic and accepted it. Willingness to say, "I'm wrong," is the first sign of a mature and serious developer.

  • "I will never accept this, period!" Some code deserves that, and there is nothing wrong with resolving a conflict this way. The opponent may accept the situation and re-write everything. And learn something too.

  • "Let's do what the architect says!" In every project, there is a software architect who makes final decisions. Appeal to his opinion and get his final decision. Invite him into the discussion, and ask him to take one side in the conflict. Once he tells you that you're wrong, accept the decision and try to learn from it.

Thus, either stand strong on your position and fight for it or admit that you're wrong. One way or the other. But don't make a compromise!

Don't get me wrong; it's not about being stubborn and holding your cards no matter how bad they are. Be flexible and learn on the spot. Your position may and should change during the negotiation, but don't accept anything that you don't like. You can exit the conflict either by being fully convinced that the opponent is right or when the architect says so. Nothing in between.

No Bullshit

Again, you fearlessly said that a method should be designed differently. Your opponent, the author of the code, replies that he doesn't think so. You take a look again and decide to stand behind your position. You still think you're right, and you're not going to make a compromise. Now what? It's too early to call an architect, so try to convince your opponent.

In most cases, convincing is teaching. You know something that he doesn't know; that's why he created the method the way you don't like. One of you needs some additional education. Here is an opportunity for you to be a teacher to your colleague. To be an effective teacher, you need to show proof. You need to back up your logic and make sure he understands and accepts it.

Be ready to show links, articles, books, reports, examples, etc. Just "I know this because I've been writing Java for 15 years" is not enough. Moreover, this type of argument only makes you less convincing.

If you don't have enough convincing proof, think again---maybe you are wrong.

Also, remember that it's your job to prove that the code you're reviewing is bad. The author of the code doesn't have to prove anything. His code is great by default. The job of the reviewer is to show why and how that's actually not the case. In other words, you're the plaintiff and he is the defendant. Not the other way around.

No Offense

This is the last and most difficult principle to follow. No matter how bad the code is and how stubborn your opponent is, you must remain professional. To be honest, I find this very difficult sometimes. At Zerocracy, we work in distributed teams and hire a few new people every week. Some of them, despite our screening criteria, turn out to be rather difficult to deal with.

I encountered a funny situation about a year ago when a new guy was supposed to create a small (20 to 30 lines of code) new feature in an existing Java library. He sent me a pull request (I was a code reviewer) after he put in a few hundred lines of code. That code was absolute garbage and obviously not written by him. I immediately understood that he found it somewhere and copied it. But what could I do? How could I reject it without saying his attitude was unacceptable for a professional developer? I had to spend some time objectively blaming his code for its style, its design, etc. I had to make many serious comments in order to show him that to fix it all, he should delete the garbage and re-write it from scratch. I never saw him again after that task.

My point is that it's easy to be professional when you're dealing with professionals. Unfortunately, that's not always the case. But no matter how bad the code in front of you is, be patient and convincing.

© Yegor Bugayenko 2014–2018

Don't Repeat Yourself in Maven POMs; Use Jcabi-Parent


Maven is a build automation tool mostly for Java projects. It's a great tool, but it has one important drawback that has motivated the creation of similar tools, like Gradle and SBT. That weakness is its verbosity of configuration. Maven gets all project build parameters from pom.xml, an XML file that can get very long. I've seen POM files of 3,000-plus lines. Taking into account 1) recent DSL buzz and 2) fear of XML, it's only logical that many people don't like Maven because of its pom.xml verbosity.

But even if you're an XML fan who enjoys its strictness and elegance (like myself), you won't like the necessity to repeat yourself in pom.xml for every project. If you're working on multiple projects, code duplication will be enormous. An average Java web app uses a few dozen standard Maven plugins and almost the same number of pretty common dependencies, like JUnit, Apache Commons, Log4J, Mockito, etc. All of them have their versions and configurations, which have to be specified if you want to keep the project stable and avoid Maven warnings. Thus, once a new version of a plugin is released, you have to go through all pom.xml files in the projects you're working on and update it there. You obviously understand what code duplication means. It's a disaster. However, there is a solution.

jcabi-parent is a very simple Maven dependency with nothing inside it except a large pom.xml with multiple pre-configured dependencies, profiles, and plugins. All you need to do in order to reuse them all in your project is define com.jcabi:parent as your parent POM:

<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
    http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.jcabi</groupId>
    <artifactId>parent</artifactId>
    <!-- check the latest version at http://parent.jcabi.com -->
    <version>0.32.1</version>
  </parent>
  [...]
</project>

That's all you need. Now you can remove most of your custom configurations from pom.xml and rely on defaults provided by jcabi-parent. Its pom.xml is rather large and properly configured. Multiple projects depend on it, so you can be confident that you're using the best possible configuration of all standard plugins.
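Inheritance is not a lock-in, either. Maven merges the parent and child POM sections, so a child pom.xml can still override anything jcabi-parent pre-configures. A hedged sketch (the plugin and the Java version chosen here are just an illustration):

```xml
<build>
  <plugins>
    <plugin>
      <!-- Re-declare an inherited plugin to override its
           configuration; its version is still inherited
           from the parent POM. -->
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
```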

Here are a few examples of pom.xml from projects that use jcabi-parent (you can see how compact they are): Xembly, ReXSL, jcabi-http, and Qulice.


XSL Transformation in Java: An Easy Way


XSL transformation (XSLT) is a powerful mechanism for converting one XML document into another. However, in Java, XML manipulations are rather verbose and complex. Even for a simple XSL transformation, you have to write a few dozen lines of code---and maybe even more than that if proper exception handling and logging are needed. jcabi-xml is a small open source library that makes life much easier by enabling XML parsing and XPath traversing with a few simple methods. Let's see how this library helps in XSL transformations.
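To see the verbosity the library hides, here is a minimal sketch of a transformation done with nothing but the JDK's built-in JAXP API (the `greeting` document and the stylesheet are made up for illustration):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public final class PlainXslt {
  // A made-up stylesheet that renders <greeting>X</greeting> as text.
  public static final String XSL =
    "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
    + " version='1.0'><xsl:output method='text'/>"
    + "<xsl:template match='/greeting'>"
    + "Hello, <xsl:value-of select='.'/>!"
    + "</xsl:template></xsl:stylesheet>";

  public static String transform(final String xml) throws Exception {
    // Every call site has to juggle factories, sources, and writers.
    final Transformer transformer = TransformerFactory.newInstance()
      .newTransformer(new StreamSource(new StringReader(XSL)));
    final StringWriter output = new StringWriter();
    transformer.transform(
      new StreamSource(new StringReader(xml)),
      new StreamResult(output)
    );
    return output.toString();
  }

  public static void main(final String[] args) throws Exception {
    System.out.println(transform("<greeting>world</greeting>"));
  }
}
```

And this is without any exception handling or logging; jcabi-xml wraps this plumbing behind one constructor call and one method call.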

First, take a look at a practical example---rultor.com---a hosted DevOps assistant that automates release, merge, and deploy operations. Rultor keeps each conversation session with an end user (a.k.a. "talk") in a DynamoDB record. There are multiple situations to handle in each talk; that's why using multiple columns of a record is not really feasible. Instead, we're keeping only a few parameters of each talk in record columns (like ID and name) and putting all the rest in a single XML column.

This is approximately how our DynamoDB table looks:

+----+---------------+--------------------------------------+
| id | name          | xml                                  |
+----+---------------+--------------------------------------+
| 12 | jcabi-xml#54  | <?xml version='1.0'?>                |
|    |               | <talk public="true">                 |
|    |               |   <request id="e5f4b3">...</request> |
|    |               | </talk>                              |
+----+---------------+--------------------------------------+
| 13 | jcabi-email#2 | <?xml version='1.0'?>                |
|    |               | <talk public="true">                 |
|    |               |   <daemon id="f787fe">...</daemon>   |
|    |               | </talk>                              |
+----+---------------+--------------------------------------+

Once a user posts @rultor status into a GitHub ticket, Rultor has to answer with a full status report about the current talk. In order to create such a text answer (a regular user would not appreciate an XML response), we have to fetch that xml column from the necessary DynamoDB record and convert it to plain English text.

Here is how we're doing that with the help of jcabi-xml and its class, XSLDocument.

final String xml = // comes from DynamoDB
final XSL xsl = new XSLDocument(
  this.getClass().getResourceAsStream("status.xsl")
);
final String text = xsl.applyTo(new XMLDocument(xml));

That's it. Now let's see what's there in that status.xsl file (this is just a skeleton of it; the full version is here):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">
  <xsl:output method="text"/>
  <xsl:template match="/talk">
    <xsl:text>Hi, here is your status report:</xsl:text>
    ...
  </xsl:template>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

It is good practice to create XSL documents only once per application run. We have a static utility method XSLDocument.make() for this:

final class Foo {
  private static final XSL STYLESHEET = XSLDocument.make(
    Foo.class.getResourceAsStream("stylesheet.xsl")
  );
  public XML style(final XML xml) {
    return Foo.STYLESHEET.transform(xml);
  }
}

Pay attention to the fact that we're using XSLT 2.0. The built-in Java implementation of XSLT doesn't support version 2.0; in order to make it work, we're using these two Saxon dependencies in Maven:

<dependency>
  <groupId>net.sourceforge.saxon</groupId>
  <artifactId>saxon</artifactId>
  <version>9.1.0.8</version>
  <scope>runtime</scope>
</dependency>
<dependency>
  <groupId>net.sourceforge.saxon</groupId>
  <artifactId>saxon</artifactId>
  <version>9.1.0.8</version>
  <classifier>xpath</classifier>
  <scope>runtime</scope>
</dependency>

All you need to do to start using jcabi-xml for XSL transformations is add this dependency to your pom.xml:

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-xml</artifactId>
  <!-- specify the latest version here, unless your
       parent POM already manages it -->
</dependency>

If you have any problems or suggestions, don't hesitate to submit an issue to the GitHub issue tracker.


Making Your Boss Happy Is a False Objective


We all have bosses. We also have customers who pay us for running their software projects. They are my bosses for the duration of the contract. I also act as a boss for the developers who work for Zerocracy. It seems obvious that a good employee/contractor is one who makes his boss/customer happy. But only a bad employee works toward this goal. Trying to make your boss happy is a false target that, if pursued, ruins the project. A professional employee works for the project, not for the boss.

The Million Dollar Hotel (2000) by Wim Wenders

We all work on projects as developers, designers, programmers, managers, testers, you name it. The boss is also a member of the project. More formally, he or she is a stakeholder, same as every one of us. Each stakeholder has his own needs for the project: 1) Jeff, the developer, wants to learn Scala and collect his paychecks every two weeks; 2) Sally, the product owner, wants to attend an expo in Paris and also collect her paychecks; 3) Bob, the CTO, wants to raise round A funding and collect a big paycheck; etc.

The project has its own objectives, to achieve 1 million downloads in less than six months and under $300,000, for example. This is what the project works for. This is what all of us are here for.

Our personal needs may be fully satisfied while we're all working toward this goal, or some of them may be sacrificed. I mean all of us, including the boss, whoever he or she is, either a CTO, a co-founder, a project manager, or a team lead.

The project is the source of our checks. Not the CFO.

The CFO is a stakeholder, like everyone else. The project gives him more power than others because it's necessary for the whole mechanism to work properly. Every project member has his or her own roles and responsibilities. I write code; the CFO writes checks. I eat at McDonald's; he drives a Jaguar. We have different needs, and we both agreed that the project would satisfy them. Otherwise we wouldn't be here, right?

We're all parts of a mechanism called a "project," which works according to the rules and principles of project management whether we are aware of them or not. Whether we have a project manager or not. Even if we violate all of them and manage ourselves in total chaos, we still have a scope, cost, schedule, and all other attributes of project management.

A professional and savvy boss understands that his role in the mechanism is to clearly define project objectives and make sure everybody's needs are aligned with those objectives. In a properly managed and organized project, everybody sees and feels how his or her personal needs are satisfied when the project achieves its objectives: Jeff learns Scala, Sally sees Paris, and Bob buys a new house.

However, if Jeff wants to learn Scala and we're developing an iOS application, that is a problem for the boss to resolve. Either convince Jeff to fall in love with Swift (I doubt that's possible) or replace him with someone who is already in love with it. It's clear that a professional boss will resort to such a tragic act as firing Jeff not because of his personal feelings towards Jeff but because they are both working toward the project objectives. Jeff and the boss will both understand that Jeff's need to learn Scala is not aligned with the objective of the project.

It is the CTO's responsibility to do something about Jeff when his personal needs become misaligned with the objectives of the project that pays his salary. A professional CTO understands that and always acts in the best interest of the project, not of himself or anyone else personally.

I believe a professional team player does two things: obeys and resists.

First, you have to understand that the boss is here in order to help you organize your time, your tasks, your communications, your plans, etc. He knows more about the project and uses that information to help you do your job. Your real boss is the project; the boss you interact with is just a hired manager who translates project objectives into plans, instructions, schedules, etc.

This boss is your colleague who does management while you're writing code. You're both equal. You and he are in the same boat. Your functions are different than his; that's all. You're not working for him but with him on a project. A true professional team player feels himself equal to all other members of the project, no matter how high they are in the hierarchy.

At the same time, he strictly follows the process and obeys all project rules and instructions, not because he is afraid of being fired but because he wants the project to succeed.

Second, being a professional team player requires a constant readiness to resist each and every instruction if you feel it contradicts the project objectives. A true professional doesn't work for a boss. He doesn't want to make the boss happy. He actually doesn't care whether the boss is happy or not. He knows that the real boss is the project and tries to make the project successful and ... happy.

A true professional always works for himself. Jeff wants to learn Scala and earn a certain amount of cash. He joined the project in order to satisfy these needs. If the project fails, Jeff won't get the money and won't fully learn Scala. So if the boss tells Jeff to do something that may jeopardize the project's success, will Jeff do it? Does he care about disappointing the boss? Absolutely not. All he cares about is the project's success, which translates to his personal success.

Thus, making your boss happy is a goal for the immature, fearful, lazy, and weak. Making your project successful is an objective for professional, strong, mature, and brave team players.


If. Then. Throw. Else. WTF?


This is the code I could never understand:

if (x < 0) {
  throw new Exception("X can't be negative");
} else {
  System.out.println("X is positive or zero");
}

I have been trying to find a proper metaphor to explain its incorrectness. Today I finally found it.

If-then-else is a forking mechanism of procedural programming. The CPU either goes to the left and then does something or goes to the right and does something else. Imagine yourself driving a car and seeing this sign:

[Figure: a road sign forking traffic into two lanes---no trucks in the left lane---that merge again further ahead]

It looks logical, doesn't it? You can go in the left lane if you're not driving a truck. Otherwise you should go in the right lane. Both lanes meet up in a while. No matter which one you choose, you will end up on the same road. This is what this code block does:

if (x < 0) {
  System.out.println("X is negative");
} else {
  System.out.println("X is positive or zero");
}

Now, try to imagine this sign:

[Figure: a road sign showing a lane that continues after a dead-end sign]

It looks very strange to me, and you will never see this sign anywhere simply because a dead end means an end, a full stop, a finish. What is the point of drawing a lane after the dead end sign? There is no point.

This is how a proper sign would look:

[Figure: a road sign where the dead-end lane simply stops]

This is how a proper code block would look:

if (x < 0) {
  throw new Exception("X can't be negative");
}
System.out.println("X is positive or zero");

The same is true for loops. This is wrong:

for (int x : numbers) {
  if (x < 0) {
    continue;
  } else {
    System.out.println("found positive number");
  }
}

While this is right:

for (int x : numbers) {
  if (x < 0) {
    continue;
  }
  System.out.println("found positive number");
}

There is no road after the dead end! If you draw it, your code looks like this very funny snippet, which I found a few years ago while reviewing code written by a very well-paid developer at a very serious company:

if (x < 0) {
  throw new Exception("X is negative");
  System.exit(1);
}

Don't do this.


How to Cut Corners and Stay Cool


You have a task assigned to you, and you don't like it. You are simply not in the mood. You don't know how to fix that damn bug. You have no idea how that bloody module was designed, and you don't know how it works. But you have to fix the issue, which was reported by someone who has no clue how this software works. You get frustrated and blame that stupid project manager and programmers who were fired two years ago. You spend hours just to find out how the code works. Then even more hours trying to fix it. In the end, you miss the deadline and everybody blames you. Been there, done that?

Regarding Henry (1991) by Mike Nichols

There is, however, an alternative approach that provides a professional exit from this situation. Here are some tips I recommend to my peers who code with me in Zerocracy projects. In a nutshell, I'm going to explain how you can cut corners and remain professional, 1) protecting your nerves, 2) optimizing your project's expenses, and 3) increasing the quality of the source code.

Here is a list of options you have, in order of preference. I would recommend you start with the first one on the list and proceed down when you have to.

Create Dependencies, Blame Them, and Wait

This is the first and most preferable option. If you can't figure out how to fix an issue or how to implement a new feature, it's a fault of the project, not you. Even if you can't figure it out because you don't know anything about Ruby and they hired you to fix bugs in a Ruby on Rails code base---it's their fault. Why did they hire you when you know nothing about Ruby?

So be positive; don't blame yourself. If you don't know how this damn code works, it's a fault of the code, not you. Good code is easy to understand and maintain.


Don't try to eat spaghetti code; complain to the chef and ask him or her to cook something better (BTW, I love spaghetti).

How can you do that? Create dependencies---new bugs complaining about unclear design, lack of unit tests, absence of necessary classes, or whatever. Be creative and offensive---in a constructive and professional way, of course. Don't get personal. No matter who cooked that spaghetti, you have nothing against him or her personally. You just want another dish, that's all.

Once you have those dependencies reported, explain in the main ticket that you can't continue until all of them are resolved. You will legally stop working, and someone else will improve the code you need. Later, when all dependencies are resolved and the code looks better, try to get back to it again. If you still see issues, create new dependencies. Keep doing this until the code in front of you is clean and easy to fix.

Don't be a hero---don't rush into fixing the bad code you inherited. Think like a developer, not a hacker. Remember that your first and most important responsibility as a disciplined engineer is to help the project reveal maintainability issues. Who will fix them and how is the responsibility of a project manager. Your job is to reveal, not to hide. By being a hero and trying to fix everything in the scope of a single task, you're not doing the project a favor---you're concealing the problem(s).

Edit: Another good example of a dependency may be a question raised at, say, StackOverflow or on the user list of a third-party library. If you can't find a solution yourself and the problem is outside the scope of your project, submit a question to SO and put a link to it in the source code (in a JavaDoc block, for example).

Demand Better Documentation and Wait

All dependencies are resolved and the code looks clean, but you still don't understand how to fix the problem or implement a new feature. It's too complex. Or maybe you just don't know how this library works. Or you've never done anything like that before. Anyhow, you can't continue because you don't understand. And in order to understand, you will need a lot of time---much more than you have from your project manager or your Scrum board. What do you do?


Again, think positively and don't blame yourself. If the software is not clear enough for a total stranger, it's "their" fault, not yours. They created the software in a way that's difficult to digest and modify. But the code is clean; it's not spaghetti anymore. It's a perfectly cooked lobster, but you don't know how to eat lobster! You've never eaten it before.

The chef did a good job; he cooked it well, but the restaurant didn't give you any instructions on how to eat such a sophisticated dish. What do you do?

You ask for a manual. You ask for documentation. Properly designed and written source code must be properly documented. Once you see that something is not clear to you, create new dependencies that ask for better documentation of certain aspects of the code.

Again, don't be a hero and try to understand everything yourself. Of course you're a smart guy, but the project doesn't need a single smart guy. The project needs maintainable code that is easy to modify, even by someone who is not as smart as yourself. So do your project a favor: reveal the documentation issue, and ask someone to fix it for you. Not just for you, for everybody. The entire team will benefit from such a request. Once the documentation is fixed, you will continue with your task, and everybody will get source code that is a bit better than it was before. Win-win, isn't it?

Reproduce the Bug and Call It a Day

Now the code is clean, the documentation is good enough, but you're stuck anyway. What to do? Well, I'm a big fan of test-driven development, so my next suggestion would be to create a test that reproduces the bug. Basically, this is what you should start every ticket with, be it a bug or a feature. Catch the bug with a unit test! Prove that the bug exists by failing the build with a new test.


This may be rather difficult to achieve, especially when the software you're trying to fix or modify was written by someone who had no idea about unit testing. There are plenty of techniques that may help you find a way to make such software more testable. I would highly recommend you read Working Effectively with Legacy Code by Michael Feathers. It describes many different patterns, and most of them work.

Once you manage to reproduce the bug and the build fails, stop right there. That's more than enough for a single piece of work. Skip the test (for example, using the @Ignore annotation in JUnit 4) and commit your changes. Then add documentation to the unit test you just created, preferably in the form of a @todo. Explain there that you managed to reproduce the problem but didn't have enough time to fix it. Or maybe you just don't know how to fix it. Be honest and give all possible details.

I believe that catching a bug with a unit test is, in most cases, more than 80% of success. The rest is much simpler: just fix the code and make the test pass. Leave this job to someone else.
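A minimal sketch of this flow, in plain Java so it stays self-contained (in a real code base the check would be a JUnit 4 test method carrying the @Ignore annotation mentioned above; the host() method and its off-by-one bug are entirely hypothetical):

```java
public final class BugReproduction {
  // Hypothetical buggy method: it drops the last character of the host.
  public static String host(final String url) {
    final int start = url.indexOf("//") + 2;
    return url.substring(start, url.length() - 1); // off-by-one bug
  }

  // The reproduction: it fails while the bug is alive. In JUnit 4 this
  // would be a test method marked @Ignore, with a @todo explaining what
  // was found, so the build stays green until someone picks it up.
  public static void check() {
    final String found = host("http://example.com");
    if (!"example.com".equals(found)) {
      throw new AssertionError("bug reproduced: got " + found);
    }
  }

  public static void main(final String[] args) {
    try {
      check();
      System.out.println("bug is gone; un-ignore the test");
    } catch (final AssertionError ex) {
      System.out.println(ex.getMessage());
    }
  }
}
```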

Prove a Bug's Absence

Very often you simply can't reproduce a bug. That's not because the code is not testable and can't be used in a unit test but because you can't reproduce an error situation. You know that the code crashes in production, but you can't crash it in a test. The error stack trace reported by the end user or your production logging system is not reproducible. It's a very common situation. What do you do?


I think the best option here is to create a test that will prove that the code works as intended. The test won't fail, and the build will remain clean. You will commit it to the repository and ... report that the problem is solved. You will say that the reported bug doesn't really exist in real life. You will state that there is no bug---"our software works correctly; here is the proof: see my new unit test."

Will they believe you? I don't think so, but they don't have a choice. They can't push you any further. You've already done something---created a new test that proves everything is fine. The ticket will be closed and the project will move on.

If, later on, the same problem occurs in production, a new bug will be reported. It will be linked to your ticket. Your experience will help someone investigate the bug further. Maybe that guy will also fail to catch the bug with a test and will also create a new, successful and "useless" test. And this may happen again and again. Eventually, this cumulative group experience will help the last guy catch the bug and fix it.

Thus, a new passing test is a good response to a bug that you can't catch with a unit test.
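Sketched in plain Java (a real project would use JUnit; the discount logic is a hypothetical stand-in for whatever code the production report blames):

```java
public final class DiscountTest {
  // Hypothetical method that a production bug report claims misbehaves.
  public static int discounted(final int price) {
    return price - price / 10;
  }

  public static void main(final String[] args) {
    // The checks pass: for every input we can come up with, the code
    // works as intended. Committed to the repository, this test becomes
    // the recorded evidence that the bug is not reproducible.
    if (discounted(100) != 90 || discounted(0) != 0 || discounted(5) != 5) {
      throw new AssertionError("the reported bug exists after all");
    }
    System.out.println("all checks passed; the bug does not exist here");
  }
}
```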

Disable the Feature

Sometimes the unit test technique won't work, mostly because a bug will be too important to be ignored. They won't agree with you when you show them a unit test that proves the bug doesn't exist. They will tell you that "when our users are trying to download a PDF, they get a blank page." And they will also say they don't really care about your bloody unit tests. All they care about is that PDF document that should be downloadable. So the trick with a unit test won't work. What do you do?

It depends on many factors, and most of these factors are not technical. They are political, organizational, managerial, social, you name it. However, in most cases, I would recommend you disable that toxic feature, release a new version, and close the ticket.

You will take the problem off your shoulders and everybody will be pleased. Well, except that poor end user. But this is not your problem. This is the fault of management, which didn't organize pre-production testing properly. Again, don't take this blame on yourself. Your job is to keep the code clean and finish your tickets in a reasonable amount of time. Their job is to make sure that developers, testers, DevOps, marketers, product managers, and designers work together to deliver the product with an acceptable number of errors.

Production errors are not programmers' mistakes, though delayed tickets are. If you keep a ticket in your hands for too long, you become an unmanageable unit of work. They simply can't manage you anymore. You're doing something, trying to fix the bug, saying "I'm trying, I'm trying ..." How can they manage such a guy? Instead, you should deliver quickly, even if it comes at the cost of a temporarily disabled feature.
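Here is a sketch of the simplest possible kill switch (PdfExport and its flag are hypothetical; a real project would usually read such a toggle from configuration rather than a constant):

```java
public final class PdfExport {
  // Hypothetical feature flag: flipped to false in the release that
  // closes the ticket, and back to true once the bug is truly fixed.
  private static final boolean ENABLED = false;

  public static String download() {
    if (!ENABLED) {
      // Fail fast with an honest message instead of a blank page.
      return "PDF export is temporarily unavailable";
    }
    return render();
  }

  private static String render() {
    // The broken code path stays in place, just unreachable for users.
    throw new UnsupportedOperationException("broken in production");
  }

  public static void main(final String[] args) {
    System.out.println(download());
  }
}
```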

Say No

OK, let's say none of the above works. The code is clean, the documentation is acceptable, but you can't catch the bug, and they don't accept a unit test from you as proof of the bug's absence. They also don't allow you to disable a feature, because it is critical to the user experience. What choices do you have? Just one.


Be professional and say "No, I can't do this; find someone else." Being a professional developer doesn't mean being able to fix any problem. Instead, it means honesty. If you see that you can't fix the problem, say so as soon as possible. Let them decide what to do. If they eventually decide to fire you because of that, you will remain a professional. They will remember you as a guy who was honest and took his reputation seriously. In the end, you will win.

Don't hold the task in your hands. The minute you realize you're not the best guy for it or you simply can't fix it---notify your manager. Make it his problem. Actually, it is his problem in the first place. He hired you. He interviewed you. He decided to give you this task. He estimated your abilities and your skills. So it's payback time.

Your "No!" will be very valuable feedback for him. It will help him make his next important management decisions.

On the other hand, if you lie just to give the impression you're a guy who can fix anything and yet fail in the end, you will damage not only your reputation but also the project's performance and objectives.


A Compound Name Is a Code Smell


Do you name variables like textLength, table_name, or current-user-email? All three are compound names that consist of more than one word. Even though they look more descriptive than name, length, or email, I would strongly recommend avoiding them. I believe a variable name that is more complex than a noun is a code smell. Why? Because we usually give a variable a compound name when its scope is so big and complex that a simple noun would sound ambiguous. And a big, complex scope is an obvious code smell.

The Meaning of Life (1983) by Terry Jones and Terry Gilliam

The scope of a variable is the place where it is visible, like a method, for example. Look at this Ruby class:

class CSV
  def initialize(csvFileName)
    @fileName = csvFileName
  end
  def readRecords()
    File.readlines(@fileName).map do |csvLine|
      csvLine.split(',')
    end
  end
end

The visible scope of variable csvFileName is method initialize(), which is a constructor of the class CSV. Why does it need a compound name that consists of three words? Isn't it already clear that a single-argument constructor of class CSV expects the name of a file with comma-separated values? I would rename it to file.

Next, the scope of @fileName is the entire CSV class. Renaming a single variable in the class to just @file won't introduce any confusion. It's still clear what file we're dealing with. The same situation exists with the csvLine variable. It is clear that we're dealing with CSV lines here. The csv prefix is just a redundancy. Here is how I would refactor the class:

class CSV
  def initialize(file)
    @file = file
  end
  def records()
    File.readlines(@file).map do |line|
      line.split(',')
    end
  end
end

Now it looks clear and concise.

If you can't perform such a refactoring, it means your scope is too big and/or too complex. An ideal method should deal with up to five variables, and an ideal class should encapsulate up to five properties.

If we have five variables, can't we find five nouns to name them?

Adam and Eve didn't have second names. They were unique in Eden, as were many other characters in the Old Testament. Second and middle names were invented later in order to resolve ambiguity. To keep your methods and classes clean and solid, and to prevent ambiguity, try to give your variables and methods unique single-word names, just like Adam and Eve were named by you know who :)

PS. Also, redundant variables are evil as well.

© Yegor Bugayenko 2014–2018

Continuous Integration on Windows, with Appveyor and Maven


The purpose of Continuous Integration is to tell us, the developers, when the product we're working on is not "packagable" any more. The sooner we get the signal, the better. Why? Because the damage will be younger if we find it sooner. The younger the damage, the easier it is to fix. There are many modern and high-quality hosted continuous integration services, but only one of them (to my knowledge) supports Windows as a build platform---appveyor.com. My experience tells me that it's a good practice to continuously integrate on different platforms at the same time, especially when developing an open source library. That's why, at Zerocracy, we're using AppVeyor in combination with Travis.
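Since the post pairs AppVeyor with Travis, here is the kind of minimal .travis.yml that would cover the Linux side of the same Maven project (a sketch; the JDK version and cache layout are my assumptions, not from the post):

```yaml
language: java
jdk:
  - oraclejdk8
script:
  - mvn clean install --batch-mode
cache:
  directories:
    - $HOME/.m2
```

With both files in the repository root, every push gets built on Windows and Linux in parallel.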

This is how I managed to configure AppVeyor to build my Java Maven projects (this is the appveyor.yml configuration file you're supposed to place in the root directory of your GitHub repository):

version: '{build}'
os: Windows Server 2012
install:
  - ps: |
      Add-Type -AssemblyName System.IO.Compression.FileSystem
      if (!(Test-Path -Path "C:\maven" )) {
        (new-object System.Net.WebClient).DownloadFile(
          'http://www.us.apache.org/dist/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.zip',
          'C:\maven-bin.zip'
        )
        [System.IO.Compression.ZipFile]::ExtractToDirectory("C:\maven-bin.zip", "C:\maven")
      }
  - cmd: SET PATH=C:\maven\apache-maven-3.2.5\bin;%JAVA_HOME%\bin;%PATH%
  - cmd: SET MAVEN_OPTS=-XX:MaxPermSize=2g -Xmx4g
  - cmd: SET JAVA_OPTS=-XX:MaxPermSize=2g -Xmx4g
build_script:
  - mvn clean package --batch-mode -DskipTests
test_script:
  - mvn clean install --batch-mode
cache:
  - C:\maven\
  - C:\Users\appveyor\.m2

It was not that easy at all, so I decided to share. You can see how this configuration works in these projects: jcabi-aspects, jcabi-email, jcabi-dynamo, and rultor.


Daily Stand-Up Meetings Are a Good Tool for a Bad Manager


A stand-up meeting (or simply "stand-up") is "a daily team-meeting held to provide a status update to the team members," according to Wikipedia. In the next few paragraphs, I attempt to explain why these meetings, despite being so popular in software development teams, are pure evil and should never be used by good managers.

I'm not discussing whether they are done right or wrong; there are plenty of articles about that. I'm not trying to give advice about how to do them properly so they work, either. I'm saying that a good manager should never have daily stand-ups. Because they not only "don't work" but also do very bad, sometimes catastrophic, things to your management process, whether it's agile or not. On the other hand, a bad manager will always use daily stand-ups as his or her key management instrument.

Cool Hand Luke (1967) by Stuart Rosenberg

To explain what I mean, let's look at management from a few different angles and compare how good and bad managers would organize their work.

Information

A Bad Manager Asks How Things Are Going. Strolling around the office asking how things are going is a great habit of a terrible manager. He doesn't know what his team is doing because he is not smart enough to organize the process and information flow correctly. However, he needs to know what's going on because his boss is also asking him from time to time. So the only way to collect the required information is to ask the team, "What are you working on right now?" Morning stand-up is a perfect place to ask this annoying question officially without being marked as a manager who doesn't know what he is doing.

A Good Manager Is Being Told When Necessary. Managing a project involves management of communications. When information flows are organized correctly, every team member knows when and how he or she has to report to the manager. When something goes wrong, everybody knows how such a situation has to be reported: immediately and directly. When a backlog task is completed, everybody understands how to inform a project manager if he needs this information. A perfect project manager never asks his people. Instead, they tell him when necessary. And when someone stops telling him what he needs to know, a good project manager fixes that broken communication channel. But he never uses daily meetings to collect information.

As a good manager, inform your team what your goals are and what's important to you as a project manager (or Scrum master). They should know what's important for you to know about their progress, risks, impediments, and failures. They should understand what trouble you will get into if they let you down. It is your job, as a good manager, to inform them about the most important issues the project and the team are working through. It's their job, as a good team, to inform you immediately when they have some important information. This is what perfect management is about.

If you manage to organize teamwork like that, you won't need to wait until the next morning to ask your developers what they were doing yesterday and what problems they experienced. You would have seen this information earlier, exactly when you needed it. You would stay informed about your project affairs even outside of the office. Actually, you would not need an office at all, but that's a subject for another discussion :)

Someone may say that daily stand-ups are a perfect place and time to exchange information among programmers, not just to inform the Scrum master and get his feedback. Again, we have the same argument here---why can't they exchange information when it's required, during the day? Why do we need to put 10 people together every morning to discuss something that concerns only five of them? I can answer. Bad managers, who don't know how else to organize the exchange of information between team members, use morning stand-ups as a replacement for a correct communication model. These morning meetings give the impression that the manager is working hard and well deserves his overblown salary. On the contrary, a good manager would never have any regular status update meetings, because he knows how to use effective communication instruments, like issue tracking tools, emails, code reviews, decision-making meetings, pair programming, etc.

Responsibility

A Bad Manager Micro-Manages. This guy knows very little about project management, and that's why he feels very insecure. He is afraid of losing control of the team; he doesn't trust his own people; and he always feels under-informed and shakes when his own boss asks him, "What's going on?" Because of all this, he uses his people as anti-depressant pills---when they are doing what he says, he feels more secure and stable. A daily stand-up meeting is a great place where he can ask each of us what we're doing and then tell us what we should do instead. This manager forces us to disclose our personal goals and plans in order to correct them when he feels necessary. How many times have you heard something like this: "I'm planning to test X ..." "No, next week; today you work with Y." This is micro-management. Daily stand-ups are the perfect tool for a micro-manager.

A Good Manager Delegates Responsibility. Ideal management involves four steps: 1) Breaking a complex task into smaller sub-tasks; 2) Delegating them to subordinates; 3) Declaring awards, penalties, and rules; and 4) Making sure that awards are generous, penalties are inevitable, and rules are strictly followed. A perfect manager never tells his people what to do every day and how to organize their work time. He trusts and controls. He never humiliates his people by telling them how to do their work. A great manager would say: "You're planning to test X today? It's your decision, and I fully respect it. Just remember that if Y isn't ready by the end of the week, you lose the project, as we agreed." Why would such a manager need daily stand-ups? Why would he need to ask his people what they are doing? He is not meddling in their plans. Instead, he trusts them and controls their results only.

Let me reiterate: I strongly believe that responsibility must be delegated, and this delegation consists of three components: awards, penalties, and rules. In a modern Western culture, it may be rather difficult to define them---we have long-term contracts and monthly salaries. But a good manager has to find a way. Each task has to be delegated and isolated. This means that the programmer working on the task has to be personally responsible for its success or failure. And he or she has to know the consequences.

A good manager understands that any team member inevitably tries to avoid personal responsibility. Everybody is trying to put a responsibility monkey back on the shoulders of the manager. It is natural and inevitable. And daily stand-up meetings only help everybody do this trick.

When you ask me in the morning how things are going, I'll say that there are some problems and I'm not sure that I will be able to finish the task by the end of the week. That's it! I'm not responsible for the task anymore. It's not my fault if I fail. I told you that I may fail, remember? From now, the responsibility is yours.

A good manager knows about this trick and prevents it by explicitly defining awards, penalties, and rules. When I tell you that I may fail, you remind me that I'm going to lose my awards and will get penalties instead:

- I'm not sure I can meet the deadline ...
- Sorry to hear that you're going to lose your
  $200 weekend bonus because of that :(

Have you seen many project managers or Scrum masters saying such a thing? Not so many, I believe. Yes, a good manager is a rare creature. But only a good manager is capable of defining awards, penalties, and rules so explicitly and strictly.

When this triangle is defined, nobody needs status update meetings every morning. Everything is clear as it is. We all know our goals and our objectives. We know what will happen if we fail, and we also understand how much we're going to get if we succeed. We don't need a manager to remind us about that every morning. And we don't need a manager to check our progress. He already gave us a very clear definition of our objectives. Why would we talk about them again every morning?

A bad manager isn't capable of defining objectives; that's why he wants to micro-manage us every morning. Actually, a bad manager is doing it during the day too. He is afraid that without well-known goals and rules, the team will do something wrong or won't do anything at all. That's why he has "to keep his hand on the pulse." In reality, he keeps his hand on the neck of the team.

Motivation

A Bad Manager De-Motivates by Public Embarrassment. He doesn't know how to organize a proper motivational system within the team; that's why he relies on a natural fear of public embarrassment. It's only logical that no one would feel comfortable saying, "I forgot it" in front of everybody. So the daily stand-up meeting is where he puts everybody in a line and asks, "What did you do yesterday?" This fearful moment is a great motivator for the team, isn't it? I don't think so.

A Good Manager Motivates by Objectives. Ideal management defines objectives and lets people achieve them using their skills, resources, knowledge, and passion. A properly defined objective always has three components: awards, penalties, and rules. A great manager knows how to translate corporate objectives into personal ones: "If we deliver this feature before the weekend, the company will generate extra profit. You, Sally, will personally get $500. If you fail, you will be moved to another, less interesting project." This is a perfectly defined objective. Do we need to ask Sally every morning, in front of everybody, if she forgot to implement the feature? If she is working hard? Will this questioning help her? Absolutely not! She already knows what she is working for, and she is motivated enough. When she finishes on time, organize a meeting and give her a $500 check in front of everybody. This is what a good manager uses meetings for.

There's more to this, too, as daily status updates in front of everybody motivate the best team players to backslide and become the same as the worst ones. Well, this is mostly because they don't want to offend anyone by their super performance. It is in our nature to try to look similar to everybody else while being in a group. When everybody reports, "I still have nothing to show," it would be strange to expect a good programmer to say, "I finished all my tasks and want to get more." Well, this may happen once, but after a few times, this A player will either stop working hard or will change the team. He will see that his performance is standing out and that this can't be appreciated by the group, no matter what the manager says.

A good manager understands that each programmer has his or her own speed, quality, and salary. A good manager gives different tasks to different people and expects different results from them. Obviously, lining everybody up in the morning and expecting similar reports from them is a huge mistake. The mistake will have a catastrophic effect on A players, who are interested in achieving super results and expect to be super-appreciated and compensated.

A bad manager can't manage different people differently, just because he doesn't know how. That's why he needs daily stand-ups, where everybody reports almost the same, and it's easy to compare their results to each other. Also, it's easier to blame or to cheer up those whose reports differ from the others'. In other words, a bad manager uses daily stand-ups as an instrument of equality, which in this case only ruins the entire team's motivation.


Daily stand-ups, as well as any status update meetings, are a great instrument to hide and protect a lazy and stupid manager. To hide his inability to manage people. To hide his lack of competence. To hide his fear of problems, challenges, and risks. If you're a good manager, don't embarrass yourself with daily stand-ups.


How to Be Honest and Keep a Customer


Most of our clients are rather surprised when we explain to them that they will have full access to the source code from the first day of the project. We let them see everything that is happening in the project, including the Git repository, bug reports, discussions between programmers, continuous integration fails, etc. They often tell me that other software development outsourcing teams keep this information in-house and deliver only final releases, rarely together with the source code.

I understand why other developers are trying to hide as much as possible. Giving a project sponsor full access to the development environment is not easy at all. Here is a summary of problems we've been having and our solutions. I hope they help you honestly show your clients all project internals and still keep them on board.

99 francs (2007) by Jan Kounen

He Is Breaking Our Process

This is the most popular problem we face with our new clients. Once they gain access to the development environment, they try to give instructions directly to programmers, walking around our existing process. "I'm paying these guys; why can't I tell them what to do?" is a very typical mindset. Instead of submitting requests through our standard change management mechanism, such a client goes directly to one of the programmers and tells him what should be fixed, how, and when. It's micro-management in its worst form. We see it very often. What do we do?

First, we try to understand why it's happening. The simplest answer is that the client is a moron. Sometimes this is exactly the case, but it's a rare one. Much more often, our clients are not that bad. What is it, then? Why can't they follow the process and abide by the rules? There are a few possible reasons.

Maybe the rules are not explained well. This is the most popular root cause---the rules of work are not clear enough for the client. He just doesn't know what he is supposed to do in order to submit a request and get it implemented. To prevent this, we try to educate our clients at the beginning of a new project. We even write guidance manuals for clients. Most of them are happy to read them and learn the way we work, because they understand that this is the best way to achieve success while working with us.

Maybe our management is chaotic, and the client is trying to "organize" us by giving explicit instructions regarding the most important tasks. We've seen it before, and we are always trying to learn from this. As soon as we see that the client is trying to micro-manage us, we ask ourselves: "Is our process transparent enough? Do we give enough information to the client about milestones, risks, plans, costs, etc.?" In most cases, it's our own fault, and we're trying to learn and improve. If so, it's important to react fast, before the client becomes too aggressive in his orders and instructions. It will be very difficult to escort him back to the normal process once he gets "micro-management" in his blood.

Maybe the client is not busy enough and has a lot of free time, which he is happy to spend by giving orders and distracting your team. I've seen this many times. A solution? Keep him busy. Turn him into a member of the team and assign him some tasks related to documentation and research. In my experience, most clients would be happy to do this work and help the project.

He Is Asking Too Much

A technically-savvy client can turn the life of an architect into a nightmare by constantly asking him to explain every single technical decision made, from "Why PostgreSQL instead of MySQL?" to "Why doesn't this method throw a checked exception?" Constantly answering such questions can turn a project into a school of programming. Even though he is paying for our time, that doesn't mean we should teach him how to develop software, right? On the other hand, he is interested in knowing how his software is developed and how it works. It's a fair request, isn't it?

I believe there is a win-win solution to this problem. Here is how we manage it. First of all, we make all his requests formal. We ask a client to create a new ticket for each request, properly explaining what is not clear and how much detail is expected in the explanation.

Second, we look at such requests positively---they are good indicators of certain inconsistencies in the software. If it's not clear for the client why PostgreSQL is used and not MySQL, it's a fault of our architect. He didn't document his decision and didn't explain how it was made, what other options were considered, what selection criteria were applied, etc. Thus, a request from a client is a bug we get for free. So, we look at it positively.

Finally, we charge our clients for the answers given. Every question, submitted as a ticket, goes through the full flow and gets billed just as any other ticket. This approach prevents the client from asking for too much. He realizes that we're ready to explain anything he wants, but he will pay for it.

He Is Telling Too Much

This problem is even bigger than the previous one. Some clients believe they are savvy enough to argue with our architect and our programmers about how the software should be developed. They don't just ask why PostgreSQL is used, they tell us that we should use MySQL, because "I know that it's a great database; my friend is using it, and his business is growing!" Sometimes it gets even worse, when suggestions are directed at every class or even a method, like "You should use a Singleton pattern here!"

Our first choice is to agree and do what he wants. But it's a road to nowhere. Once you do it, your project is ruined, and you should start thinking about a divorce with this client. Your entire team will quickly turn into a group of coding monkeys, micro-managed by someone with some cash. It's a very wrong direction; don't even think about going there.

The second choice is to tell the client to mind his own business and let us do ours. He hired us because we're professional enough to develop the software according to his requirements. If he questions our capabilities, he is free to change the contractor. But until then, he has to trust our decisions. Will this work? I doubt it. It's the same as giving him the finger. He will get offended, and you won't get anything.

The solution here is to turn the client's demands into project requirements. Most of them will be lost in the process, because they won't be sane enough to form a good requirement. Others will be documented, estimated, and crossed-out by the client himself, because he will realize they are pointless or too expensive. Only a few of them will survive, since they will be reasonable enough. And they will help the project. So it is also a win-win solution.

For example, he says that "you should use MySQL because it's great." You tell him that the project requirements document doesn't limit you to choose whichever database you like. Should it? He says yes, of course! OK, let's try to document such a requirement. How will it sound? How about "We should only use great databases"? Sounds correct? If so, then PostgreSQL satisfies this requirement. Problem solved; let us continue to do our work. He will have a hard time figuring out how to write a requirement in a way that disallows PostgreSQL but allows MySQL. It is simply not possible in most cases.

Sometimes, though, it will make sense; for example, "We should use a database server that understands our legacy data in MySQL format." This is a perfectly sane requirement, and the only way to satisfy it is to use MySQL.

Thus, my recommendation is to never take a client's demands directly to execution, but rather use them first to amend the requirements documentation. Even if you don't have such documentation, create a simple one-page document. Agree with the client that you work against this document, and when anyone wants to change something, you first have to amend the document and then have your team ensure the software satisfies it. This kind of discipline will be accepted by any client and will protect you against sudden and distracting corrections.

He Is Questioning Our Skills

When source code is open to the client, and he is technically capable of reading it, it is very possible that one day he will tell us that our code is crap and we have to learn how to program better. It has not happened in our projects for many years, but it has happened before, when we weren't using static analysis as a mandatory step in our continuous integration pipeline.

Another funny possibility is when the client shows the source code to a "friend," and he gives a "professional" opinion, which sounds like, "They don't know what they are doing." Once such an opinion hits your client's ears, the project is at a significant risk of closure. It'll be very difficult, almost impossible, to convince the client not to listen to the "friend" and continue to work with you. That's why most outsourcers prefer to keep their sources private until the very end of the project, when the final invoice is paid.

I think that an accidental appearance of a "friend" with a negative opinion is un-preventable. If it happens, it happens. You can't avoid it. On the other hand, if you think your code is perfect and your team has only talented programmers writing beautiful software, this is not going to protect you either. An opinion coming from a "friend" won't be objective; it will just be very personal, and that's why it's very credible. He is a friend of a client, and he doesn't send him bills every week. Why would he lie? Of course, he is speaking from the heart! (I'm being sarcastic.) So, no matter how beautiful your architecture and your source code is, the "friend" will always be right.

In my opinion, the only way to prevent such a situation or minimize its consequences is to organize regular and systematic independent technical reviews. They will give confidence to the client that the team is not lying to him about the quality of the product and key technical decisions made internally.


To conclude, I strongly believe it is important to be honest and open with each client, no matter how difficult it is. Try to learn from every conflict with each client, and improve your management process and your principles of work. Hiding source code is not professional and makes you look bad in the eyes of your clients and the entire industry.


Immutable Objects Are Not Dumb


After a few recent posts about immutability, including Objects Should Be Immutable and How an Immutable Object Can Have State and Behavior?, I was surprised by the number of comments saying that I badly misunderstood the idea. Most of those comments stated that an immutable object must always behave the same way---that is what immutability is about. What kind of immutability is it, if a method returns different results each time we call it? This is not how well-known immutable classes behave. Take, for example, String, BigInteger, Locale, URI, URL, Inet4Address, UUID, or wrapper classes for primitives, like Double and Integer. Other comments argued against the very definition of an immutable object as a representative of a mutable real-world entity. How could an immutable object represent a mutable entity? Huh?

The Usual Suspects (1995) by Bryan Singer

I'm very surprised. This post is going to clarify the definition of an immutable object. First, here is a quick answer. How can an immutable object represent a mutable entity? Look at an immutable class, File, and its methods, for example length() and delete(). The class is immutable, according to Oracle documentation, and its methods may return different values each time we call them. An object of class File, being perfectly immutable, represents a mutable real-world entity, a file on disk.


In this post, I said that "an object is immutable if its state can't be modified after it is created." This definition is not mine; it's taken from Java Concurrency in Practice by Goetz et al., Section 3.4 (by the way, I highly recommend you read it). Now look at this class (I'm using jcabi-http to read and write over HTTP):

@Immutable
class Page {
  private final URI uri;
  Page(URI addr) {
    this.uri = addr;
  }
  public String load() {
    return new JdkRequest(this.uri)
      .fetch().body();
  }
  public void save(String content) {
    new JdkRequest(this.uri)
      .method("PUT")
      .body().set(content).back()
      .fetch();
  }
}

What is the "state" in this class? That's right, this.uri is the state. It uniquely identifies every object of this class, and it is not modifiable. Thus, the class makes only immutable objects. And each object represents a mutable entity of the real world, a web page with a URI.

There is no contradiction in this situation. The class is perfectly immutable, while the web page it represents is mutable.
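The same idea can be sketched in Ruby (the DiskFile class and its names are mine, chosen for illustration): the object's only state is a frozen path, and that state never changes, while the file the object represents mutates freely.

```ruby
require 'tempfile'

# A hypothetical immutable wrapper: its state (the path) is fixed at
# construction, while the real-world entity it represents may change.
class DiskFile
  def initialize(path)
    @path = path.dup.freeze  # the only state; never modified afterwards
  end
  def path
    @path
  end
  def length
    File.size(@path)  # may differ between calls; the object stays immutable
  end
end

tmp = Tempfile.new('page')
file = DiskFile.new(tmp.path)
before = file.length            # the file is empty: 0 bytes
File.write(file.path, 'hello')  # the real-world entity mutates
after = file.length             # now 5 bytes; `file` itself never changed
```

Two calls to length() return different numbers, yet the object is immutable by the Goetz definition: its state, the coordinates of the entity, never changed.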

Why do most programmers I have talked to believe that if an underlying entity is mutable, an object is mutable too? I think the answer is simple---they think that objects are data structures with methods. That's why, from this point of view, an immutable object is a data structure that never changes.

This is where the fallacy is coming from---an object is not a data structure. It is a living organism representing a real-world entity inside the object's living environment (a computer program). It does encapsulate some data, which helps to locate the entity in the real world. The encapsulated data is the coordinates of the entity being represented. In the case of String or URL, the coordinates are the same as the entity itself, but this is just a special case, not a general rule.

An immutable object is not a data structure that doesn't change, even though String, BigInteger, and URL look like one. An object is immutable if and only if it doesn't change the coordinates of the real-world entity it represents. In the Page class above, this means that an object of the class, once instantiated, will never change this.uri. It will always point to the same web page, no matter what.

And the object doesn't guarantee anything about the behavior of that web page. The page is a dynamic creature of a real world, living its own life. Our object can't promise anything about the page. The only thing it promises is that it will always stay loyal to that page---it will never forget or change its coordinates.

Conceptually speaking, immutability means loyalty, that's all.


If you like this article, you will definitely like these very relevant posts too:

Objects Should Be Immutable
The article gives arguments about why classes/objects in object-oriented programming have to be immutable, i.e. never modify their encapsulated state

How an Immutable Object Can Have State and Behavior?
Object state and behavior are two very different things, and confusing the two often leads to incorrect design.

Gradients of Immutability
There are a few levels and forms of immutability in object-oriented programming, all of which can be used when they seem appropriate.


You Do Need Independent Technical Reviews!


Do you have a team of brilliant and enthusiastic programmers? Of course! You've carefully chosen them from a hundred candidates! Are they passionate about the product? Absolutely! They use cutting-edge technologies, never sleep, and hardly eat or drink anything except coffee! Do they believe in your business success? No doubts about it; they live and breathe all those features, releases, continuous delivery, user experience, etc. Are you sure they are developing the product correctly? Well, yes, you're pretty sure; why wouldn't they? ...

Arizona Dream (1992) by Emir Kusturica

Does this sound familiar? I can't count how many times I've heard these stories told by startup founders. Most of them are in love with their teams ... until that day when it's time to hire a new one. There could be many possible reasons for such a fiasco, but one of them is a lack of regular, systematic, and independent technical reviews. Nothing demotivates a development team more than a lack of attention to their deliverables. On the other hand, a regular reconciliation of their results and your quality expectations is one of the key factors that will guarantee technical success for your startup. Below I summarize my experience with organizing such technical reviews.

An independent review is when you ask someone outside of your team to look at your source code and other technical resources and give you an objective opinion about them. Every modern software team should also use internal code reviews, which is something else entirely. An internal review occurs when one programmer shows his code to other peers on the team and asks their opinion. This usually happens as a daily activity and has nothing to do with independent reviews.

An independent review is performed by a programmer who knows nothing about your team. He comes on board, checks out the code from your repository, spends a few hours (or days) looking at it and trying to understand what it does. Then, he tells you what is wrong and where. He explains how he would do it better, where he would change it, and what he would do instead. Then, you pay him and he leaves. You may never see him again, but his conclusions and suggestions help you check the reality of your code and evaluate how your team is really doing.


We, at Zerocracy, do independent reviews with every project of ours, and this is a list of principles we use:

Make Independent Reviews Systematic. This is the first and most important rule---organize such technical reviews regularly. Moreover, inform your team about the schedule, and let them be prepared for the reviews. Once a month is a good practice, according to my experience. Depending on your source code size, a full review should take from two to eight hours. Don't spend more than eight hours; there is no point in going too deep into the code during independent reviews.

Pay for Bugs Found. We always pay for bugs, not for the time spent finding them. We ask our reviewers to look at the code and report as many bugs as we need. For each bug, we pay for 15 minutes of their time. In other words, we assume that a good reviewer can find and report approximately four problems in one hour. For example, a reviewer charges $150 per hour. We hire him and ask him to find and report the 20 most critical issues he can discover. Our estimate is that he should spend five hours on this work. Thus, he will get $750 once we have 20 bugs in our tracking system reported by him. If he finds fewer, he gets proportionally less money. This payment schedule will help you focus your reviewer on the main objective of the review process---finding and reporting issues. There are no other goals. The only thing you're interested in is knowing what the issues with your current technical solution are. That's what you're paying for.

Hire the Best and Pay Well. My experience tells me that the position of an independent reviewer is a very important one. He is not just a programmer but more of an architect who is capable of looking at the solution from a very high level of abstraction, while at the same time paying a lot of attention to details; he should be very good at designing similar systems; he should know how to report a bug correctly and with enough detail; he should understand your business domain; etc. Besides all that, he should be well motivated to help you. You're not hiring him for full-time work but rather just for a few-hour gig. My advice is to try to get the best guys, and pay them as much as they ask, usually over $100 per hour. Don't negotiate, just pay. It's just a few hundred dollars for you, but the effect of their contribution will be huge.

Ask For and Expect Criticism. It is a very common mistake to ask a reviewer, "Do you like our code?" Don't expect him to tell you how great your code is. This is not what you're paying him for! You already have a full team of programmers for cheering you up; they can tell you a lot about the code they are creating and how awesome it is. You don't want to hear this again from the reviewer. Instead, you want to know what is wrong and needs to be fixed. So your questions should sound like, "What problems do you think we should fix first?" Some reviewers will try to please you with positive comments, but ignore that flattery and bring them back to the main goal---bugs. The payment schedule explained above should help.

Regularly Change Reviewers. Try not to use the same reviewer more than once on the same project (I mean the same code base). I believe the reason here is obvious, but let me reiterate: You don't need your reviewer to be nice to you and tell you how great your code is. You want him to be objective and focused on problems, not on bright sides. If you hire the same person again and again, you psychologically attach him to the source code. He's seen it once; now he has to see it again. He already told you about some problem, and now he has to repeat it again. He won't feel comfortable doing it. Instead, he will start feeling like a member of the team and will feel responsible for the source code and its mistakes. He, as any other team member, will start hiding issues instead of revealing them. Thus, for every independent technical review, get a new person.

Be Polite and Honest With Your Team. Independent reviews can be rather offensive to your programmers. They may think that you don't trust them. They may feel that you don't respect them as technical specialists. They may even decide that you're getting ready to fire them all and are currently looking for new people. This is a very possible and very destructive side effect of an independent review. How do you avoid it? I can't give you universal advice, but the best suggestion I can give is this: be honest with them. Tell them that the quality of the product is critical for you and your business. Explain to them that the business is paying them for their work and that in order to keep paychecks coming, you have to stress quality control---regularly, objectively, independently, and honestly. In the end, if you manage to organize reviews as this article explains, the team will be very thankful to you. They will gain a lot of new ideas and thoughts from every review and will ask you to repeat them regularly.

Review From Day One. Don't wait until the end of the project! I've seen this mistake many times. Very often startup founders think that until the product is done and ready for the market, they shouldn't distract their team. They think they should let the team work toward project milestones and take care of quality later, "when we have a million visitors per day." This day will never come if you let your team run without control! Start conducting independent reviews from the moment your Git repository has its first file. Until the repository is big enough, you may only spend $300 once a month to receive an objective, independent opinion about its quality. Will this ruin your budget?

Prohibit Discussions, and Ask for Formal Reporting. Don't let your reviewers talk to the team. If you do, the entire idea of a review being independent falls apart. If a reviewer is able to ask informal questions and discuss your system design with your programmers, their answers will satisfy him, and he will move on. But you, the owner of the business, will stay exactly where you were before the review. The point of the review is not to make the reviewer happy. It is exactly the opposite. You want to make him confused! If he is confused, your design is wrong and he feels the need to report a bug. The source code should speak for itself, and it should be easy enough for a stranger (the reviewer) to understand how it works. If this is not the case, there is something wrong that should be fixed.

Treat Any Question as a Bug. Don't expect a review to produce any bugs in functionality, like "I click this button and the system crashes." This will happen rarely, if ever. Your team is very good at discovering these issues and fixing them. Independent reviews are not about that kind of bug. The main goal of an independent review is to discover bugs in the architecture and design. Your product may work, but its architecture may have serious design flaws that won't allow you, for example, to handle exponential growth in web traffic. An independent reviewer will help you find those flaws and address them sooner rather than later. In order to get bugs of that kind from the reviewer, you should encourage him to report anything he doesn't like---unmotivated use of a technology, lack of documentation, unclear purpose of a file, absence of a unit test, etc. Remember, the reviewer is not a member of your team and has his own ideas about the technologies you're using and software development in general. You're interested in matching his vision with your team's. Then, you're interested in fixing all critical mismatches.

Review Everything, Not Just Source Code. Let your reviewer look at all technical resources you have, not just source code files (.java, .rb, .php, etc.) Give him access to the database schema, continuous integration panel, build environment, issue tracking system, plans and schedules, work agendas, up-time reports, deployment pipeline, production logs, customer bug reports, statistics, etc. Everything that could help him understand how your system works, and more importantly, where and how it breaks, is very useful. Don't limit the reviewer to the source code only---this is simply not enough! Let him see the big picture, and you will get a much more detailed and professional report.

Track How Inconsistencies Are Resolved. Once you get a report from the reviewer, make sure that the most important issues immediately get into your team's backlog. Then, make sure they are addressed and closed. That doesn't mean you should fix them all and listen to everything said by the reviewer. Definitely not! Your architect runs the show, not the reviewer. Your architect should decide what is right and what is wrong in the technical implementation of the product. But it's important to make him resolve all concerns raised by the reviewer. Very often you will get answers like these from him: "We don't care about it now," "we won't fix it until the next release," or "he is wrong; we're doing it better." These answers are perfectly valid, but they have to be given (reviewers are people and they also make mistakes). The answers will help you, the founder, understand what your team is doing and how well they understand their business.


If you can offer more suggestions, based on your experience, please post them below in the comments, and I'll add them to the list. I'm still thinking that I may have forgotten something important :)

© Yegor Bugayenko 2014–2018

How Much Your Objects Encapsulate?


Which line do you like more, the first or the second:

new HTTP("http://www.google.com").read();
new HTTP().read("http://www.google.com");

What is the difference? The first class HTTP encapsulates a URL, while the second one expects it as an argument of method read(). Technically, both objects do exactly the same thing: they read the content of the Google home page. Which one is the right design? Usually I hate to say this, but in this case I have to---it depends.
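To make the contrast concrete, here is a minimal sketch of both designs (the class names Page and Http and the stubbed read() bodies are mine, purely for illustration):

```java
// Hypothetical sketch of the two designs compared above.

// Design 1: the object encapsulates the URL and represents one web page.
final class Page {
  private final String url;
  Page(final String url) {
    this.url = url;
  }
  public String read() {
    return "content of " + this.url; // a real class would fetch over HTTP
  }
}

// Design 2: the object encapsulates nothing; the URL travels as an argument.
final class Http {
  public String read(final String url) {
    return "content of " + url; // a real class would fetch over HTTP
  }
}

public class Demo {
  public static void main(final String[] args) {
    System.out.println(new Page("http://www.google.com").read());
    System.out.println(new Http().read("http://www.google.com"));
  }
}
```

The only difference is where the URL lives: in the first design it is fixed at construction time; in the second it travels with every call.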

The Truman Show (1998) by Peter Weir

As we discussed before, a good object is a representative of a real-life entity. Such an entity exists outside of the object's living environment. The object knows how to access it and how to communicate with it.

What is that real-life entity in the example above? Each class gives its own answer, and the answer is given by the list of arguments its constructors accept. The first class accepts a single URL as an argument of its constructor. This tells us that the object of this class, after being constructed, will represent a web page. The second class accepts no arguments, which tells us that its objects will represent... the Universe.

I think this principle is applicable to all classes in object-oriented programming---in order to understand what real-life entity an object represents, look at its constructor. All arguments passed into the constructor and encapsulated by the object identify a real-life entity accessed and managed by the object.

Of course, I'm talking about good objects, which are immutable and don't have setters and getters.

Pay attention that I'm talking about arguments encapsulated by the object. The following class doesn't represent the Universe, even though it does have a no-arguments constructor:

class Time {
  private final long msec;
  public Time() {
    this(System.currentTimeMillis());
  }
  public Time(long time) {
    this.msec = time;
  }
}

This class has two constructors. One of them is the main one, and one is supplementary. We're interested in the main one, which implements the encapsulation of arguments.

Now, the question is which is better: to represent a web page or the Universe? It depends, but I think that in general, the smaller the real-life entity we represent, the more solid and cohesive design we give to the object.

On the other hand, sometimes we have to have an object that represents the Universe. For example, we may have this:

class HTTP {
  public String read(String url) {
    // read via HTTP and return
  }
  public boolean online() {
    // check whether we're online
  }
}

This is not an elegant design, but it demonstrates when it may be necessary to represent the entire Universe. An object of this HTTP class can read any web page from the entire web (it is almost as big as the Universe, isn't it?), and it can check whether the entire web is accessible by it. Obviously, in this case, we don't need it to encapsulate anything.

I believe that objects representing the Universe are not good objects, mostly because there is only one Universe; why do we need many representatives of it? :)


How an Immutable Object Can Have State and Behavior?


I often hear this argument against immutable objects: "Yes, they are useful when the state doesn't change. However, in our case, we deal with frequently changing objects. We simply can't afford to create a new document every time we just need to change its title." Here is where I disagree: the title is not part of the document's state, if you need to change it frequently. Instead, it is the document's behavior. A document can and must be immutable, if it is a good object, even when its title is changed frequently. Let me explain how.

Once Upon a Time in the West (1968) by Sergio Leone

Identity, State, and Behavior

Basically, there are three elements in every object: identity, state, and behavior. Identity is what distinguishes our document from other objects, state is what a document knows about itself (a.k.a. "encapsulated knowledge"), and behavior is what a document can do for us on request. For example, this is a mutable document:

class Document {
  private int id;
  private String title;
  Document(int id) {
    this.id = id;
  }
  public String getTitle() {
    return this.title;
  }
  public void setTitle(String text) {
    this.title = text;
  }
  @Override
  public String toString() {
    return String.format("doc #%d about '%s'", this.id, this.title);
  }
}

Let's try to use this mutable object:

Document first = new Document(50);
first.setTitle("How to grill a sandwich");
Document second = new Document(50);
second.setTitle("How to grill a sandwich");
if (first.equals(second)) { // FALSE
  System.out.println(
    String.format("%s is equal to %s", first, second)
  );
}

Here, we're creating two objects and then modifying their encapsulated states. Obviously, first.equals(second) will return false because the two objects have different identities, even though they encapsulate the same state.

Method toString() exposes the document's behavior---the document can convert itself to a string.

In order to modify a document's title, we just call its setTitle() once again:

first.setTitle("How to cook pasta");

Simply put, we can reuse the object many times, modifying its internal state. It is fast and convenient, isn't it? Fast, yes. Convenient, not really. Read on.

Immutable Objects Have No Identity

As I've mentioned before, immutability is one of the virtues of a good object, and a very important one. A good object is immutable, and good software contains only immutable objects. The main difference between immutable and mutable objects is that an immutable one doesn't have an identity and its state never changes. Here is an immutable variant of the same document:

@Immutable
class Document {
  private final int id;
  private final String title;
  Document(int id, String text) {
    this.id = id;
    this.title = text;
  }
  public String title() {
    return this.title;
  }
  public Document title(String text) {
    return new Document(this.id, text);
  }
  @Override
  public boolean equals(Object doc) {
    return doc instanceof Document
      && Document.class.cast(doc).id == this.id
      && Document.class.cast(doc).title.equals(this.title);
  }
  @Override
  public String toString() {
    return String.format(
      "doc #%d about '%s'", this.id, this.title
    );
  }
}

This document is immutable, and its state (id and title) is its identity. Let's see how we can use this immutable class (by the way, I'm using @Immutable annotation from jcabi-aspects):

Document first = new Document(50, "How to grill a sandwich");
Document second = new Document(50, "How to grill a sandwich");
if (first.equals(second)) { // TRUE
  System.out.println(
    String.format("%s is equal to %s", first, second)
  );
}

We can't modify a document any more. When we need to change the title, we have to create a new document:

Document first = new Document(50, "How to grill a sandwich");
first = first.title("How to cook pasta");

Every time we want to modify its encapsulated state, we have to create a new object, because the state is the identity; there is no separate identity to preserve. Look at the code of the equals() method above---it compares documents by their IDs and titles. Now the ID+title of a document is its identity!

What About Frequent Changes?

Now I'm getting to the question we started with: What about performance and convenience? We don't want to change the entire document every time we have to modify its title. If the document is big enough, that would be a huge overhead. Moreover, if an immutable object encapsulates other immutable objects, we have to change the entire hierarchy when modifying even a single string in one of them.

The answer is simple. A document's title should not be part of its state. Instead, the title should be its behavior. For example, consider this:

@Immutable
class Document {
  private final int id;
  Document(int id) {
    this.id = id;
  }
  public String title() {
    // read title from storage
  }
  public void title(String text) {
    // save text to storage
  }
  @Override
  public boolean equals(Object doc) {
    return doc instanceof Document
      && Document.class.cast(doc).id == this.id;
  }
  @Override
  public String toString() {
    return String.format("doc #%d about '%s'", this.id, this.title());
  }
}

Conceptually speaking, this document is acting as a proxy of a real-life document that has a title stored somewhere---in a file, for example. This is what a good object should do---be a proxy of a real-life entity. The document exposes two features: reading the title and saving the title. Here is how its interface would look:

@Immutable
interface Document {
  String title();
  void title(String text);
}

title() reads the title of the document and returns it as a String, and title(String) saves it back into the document. Imagine a real paper document with a title. You ask an object to read that title from the paper or to erase an existing one and write new text over it. This paper is a "copy" utilized in these methods.

Now we can make frequent changes to the immutable document, and the document stays the same. It doesn't stop being immutable, since its state (id) is not changed. It is the same document, even though we change its title, because the title is not a state of the document. It is something in the real world, outside of the document. The document is just a proxy between us and that "something." Reading and writing the title are behaviors of the document, not its state.

Mutable Memory

The only question we still have unanswered is what is that "copy" and what happens if we need to keep the title of the document in memory?

Let's look at it from an "object thinking" point of view. We have a document object, which is supposed to represent a real-life entity in an object-oriented world. If such an entity is a file, we can easily implement title() methods. If such an entity is an Amazon S3 object, we also implement title reading and writing methods easily, keeping the object immutable. If such an entity is an HTTP page, we have no issues in the implementation of title reading or writing, keeping the object immutable. We have no issues as long as a real-world document exists and has its own identity. Our title reading and writing methods will communicate with that real-world document and extract or update its title.
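For example, here is a minimal file-backed sketch (my own illustration, assuming the title is simply the whole content of a small text file):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: an immutable document that proxies a title stored in a file.
// The object's state (its id and the path) never changes; only the
// real-world entity behind it (the file) does.
final class FileDocument {
  private final int id;
  private final Path file;
  FileDocument(final int id, final Path file) {
    this.id = id;
    this.file = file;
  }
  public String title() {
    try {
      return new String(Files.readAllBytes(this.file), StandardCharsets.UTF_8);
    } catch (final IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }
  public void title(final String text) {
    try {
      Files.write(this.file, text.getBytes(StandardCharsets.UTF_8));
    } catch (final IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }
}
```

The instance never changes while the title does; calling title(String) mutates the file, which is the real-world entity behind the object.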

Problems arise when such an entity doesn't exist in a real world. In that case, we need to create a mutable object property called title, read it via title(), and modify it via title(String). But an object is immutable, so we can't have a mutable property in it---by definition! What do we do?

Think.

How could it be that our object doesn't represent a real-world entity? Remember, the real world is everything around the living environment of an object. Is it possible that an object doesn't represent anyone and acts on its own? No, it's not possible. Every object is a representative of a real-world entity. So, who does it represent if we want to keep title inside it and we don't have any file or HTTP page behind the object?


It represents computer memory.

The title of immutable document #50, "How to grill a sandwich," is stored in the memory, taking up 23 bytes of space. The document should know where those bytes are stored, and it should be able to read them and replace them with something else. Those 23 bytes are the real-world entity that the object represents. The bytes have nothing to do with the state of the object. They are a mutable real-world entity, similar to a file, HTTP page, or an Amazon S3 object.

Unfortunately, Java (and many other modern languages) do not allow direct access to computer memory. This is how we would design our class if such direct access was possible:

@Immutable
class Document {
  private final int id;
  private final Memory memory;
  Document(int id) {
    this.id = id;
    this.memory = new Memory();
  }
  public String title() {
    return new String(this.memory.read());
  }
  public void title(String text) {
    this.memory.write(text.getBytes());
  }
}

That Memory class would be implemented by JDK natively, and all other classes would be immutable. The class Memory would have direct access to the memory heap and would be responsible for malloc and free operations on the operating system level. Having such a class would allow us to make all Java classes immutable, including StringBuffer, ByteArrayOutputStream, etc.

The Memory class would explicitly emphasize the mission of an object in a software program, which is to be a data animator. An object is not holding data; it is animating it. The data exists somewhere, and it is anemic, static, motionless, stationary, etc. The data is dead while the object is alive. The role of an object is to make a piece of data alive, to animate it but not to become a piece of data. An object needs some knowledge in order to gain access to that dead piece of data. An object may need a database unique key, an HTTP address, a file name, or a memory address in order to find the data and animate it. But an object should never think of itself as data.

What Is the Practical Solution?

Unfortunately, we don't have such a memory-representing class in Java, Ruby, JavaScript, Python, PHP, and many other high-level languages. It looks like language designers didn't get the idea of alive objects vs. dead data, which is sad. We're forced to mix data with object states using the same language constructs: object variables and properties. Maybe someday we'll have that Memory class in Java and other languages, but until then, we have a few options.

Use C++. In C++ and similar low-level languages, it is possible to access memory directly and deal with in-memory data the same way we deal with in-file or in-HTTP data. In C++, we can create that Memory class and use it exactly the way we explained above.

Use Arrays. In Java, an array is a data structure with a unique property---it can be modified while being declared as final. You can use an array of bytes as a mutable data structure inside an immutable object. It's a surrogate solution that conceptually resembles the Memory class but is much more primitive.

Avoid In-Memory Data. Try to avoid in-memory data as much as possible. In some domains, it is easy to do; for example, in web apps, file processing, I/O adapters, etc. However, in other domains, it is much easier said than done. For example, in games, data manipulation algorithms, and GUI, most of the objects animate in-memory data mostly because memory is the only resource they have. In that case, without the Memory class, you end up with mutable objects :( There is no workaround.
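The "Use Arrays" option above can be sketched like this (the class name and the one-element array layout are mine, purely for illustration):

```java
import java.nio.charset.StandardCharsets;

// Sketch: the reference to the array is final (so the object formally
// stays immutable), but the array's content can be overwritten,
// playing the role of the hypothetical Memory class.
final class InMemoryDocument {
  private final int id;
  private final byte[][] memory = new byte[1][];
  InMemoryDocument(final int id) {
    this.id = id;
    this.memory[0] = new byte[0];
  }
  public String title() {
    return new String(this.memory[0], StandardCharsets.UTF_8);
  }
  public void title(final String text) {
    this.memory[0] = text.getBytes(StandardCharsets.UTF_8);
  }
}
```

The final field memory is the object's immutable state; the bytes it points to are the mutable data being animated.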

To summarize, don't forget that an object is an animator of data. It is using its encapsulated knowledge in order to reach the data. No matter where the data is stored---in a file, in HTTP, or in memory---it is conceptually very different from an object state, even though they may look very similar.

A good object is an immutable animator of mutable data. Even though it is immutable and data is mutable, it is alive and data is dead in the scope of the object's living environment.


If you like this article, you will definitely like these very relevant posts too:

Objects Should Be Immutable
The article gives arguments about why classes/objects in object-oriented programming have to be immutable, i.e. never modify their encapsulated state

Gradients of Immutability
There are a few levels and forms of immutability in object-oriented programming, all of which can be used when they seem appropriate.

Immutable Objects Are Not Dumb
Immutable objects are not the same as passive data structures without setters, despite a very common mis-belief.


Synchronization Between Nodes


When two or more software modules are accessing the same resource, they have to be synchronized. This means that only one module at a time should be working with the resource. Without such synchronization, there will be collisions and conflicts. This is especially true when we're talking about "resources" that do not support atomic transactions.


To solve this issue and prevent conflicts, we have to introduce one more element into the picture. All software modules, before accessing the resource, should capture a lock from a centralized server. Once the manipulations with the resource are complete, the module should release the lock. While the lock is being captured by one module, no other modules will be able to capture it. The approach is very simple and well-known. However, I didn't find any cloud services that would provide such a locking and unlocking service over a RESTful API. So I decided to create one---stateful.co.
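Within a single JVM, the same capture/release protocol looks like this (a local analogy using the standard library; stateful.co plays the role of the lock holder across machines):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: capture a lock before touching the shared resource,
// release it afterwards; only one module works at a time.
public class Protocol {
  private static final ReentrantLock LOCK = new ReentrantLock();
  private static int resource = 0; // the shared "resource"

  static void update() {
    LOCK.lock(); // capture; blocks while another module holds the lock
    try {
      resource = resource + 1; // manipulate the resource in isolation
    } finally {
      LOCK.unlock(); // always release, even on failure
    }
  }

  static int value() {
    return resource;
  }
}
```

With a remote locking service the lock() and unlock() calls become HTTP requests, but the discipline is identical.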

No Retreat, No Surrender (1986) by Corey Yuen

Here is a practical example. I have a Java web app that is hosted at Heroku. There are three servers (a.k.a. "dynos") running the same .war application. Why three? Because the web traffic is rather active, and one server is not powerful enough. So I have to have three of them. They all run exactly the same application.

Each web app works with a table in Amazon DynamoDB. It updates the table, puts new items into it, deletes some items sometimes, and selects them. So far, so good, but conflicts are inevitable. Here is an example of a typical interaction scenario between the web app and DynamoDB (I'm using jcabi-dynamo):

Table table = region.table("posts");
Item item = table.frame()
  .where("name", "Jeff")
  .iterator().next();
String salary = item.get("salary");
item.put("salary", this.recalculate(salary));

The logic is obvious here. First, I retrieve an item from the table posts, then read its salary, and then modify it according to my recalculation algorithm. The problem is that another module may start to do the same while I'm recalculating. It will read the same initial value from the table and will start exactly the same recalculation. Then it will save a new value, and I will save one too. We will end up having Jeff's salary modified only once, while users will expect a double modification since two of them initiated two transactions with two different web apps.
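The lost update is easy to reproduce with a deterministic sketch (the numbers and the recalculate() logic are hypothetical, chosen only to make the arithmetic visible):

```java
// Sketch: two modules each read the same salary, recalculate it,
// and write it back. Interleaved, one raise is lost.
public class LostUpdate {
  static int recalculate(final int salary) {
    return salary + 100; // hypothetical raise of $100
  }
  static int simulate() {
    int table = 1000;          // salary as stored in the table
    final int readByA = table; // module A reads 1000
    final int readByB = table; // module B reads 1000, before A writes
    table = recalculate(readByA); // A writes 1100
    table = recalculate(readByB); // B overwrites with 1100; A's raise is lost
    return table;
  }
  public static void main(final String[] args) {
    System.out.println(simulate()); // 1100, although two raises were applied
  }
}
```

Two modifications ran, yet the table ends at 1100 instead of 1200; that is exactly the conflict the lock prevents.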

The right approach here is to "lock" the DynamoDB table first, even before reading the salary. Then do the modifications and eventually unlock it. Here is how stateful.co helps me. All I need to do is create a new named lock in the stateful.co web panel, get my authentication keys, and modify my Java code:

Sttc sttc = new RtSttc(
  new URN("urn:github:526301"), // my GitHub ID
  "9FF3-4320-73FB-EEAC" // my secret key!
);
Locks locks = sttc.locks();
Lock lock = locks.get("posts-table-lock");
Table table = region.table("posts");
final Item item = table.frame()
  .where("name", "Jeff")
  .iterator().next();
new Atomic(lock).call(
  new Callable<Void>() {
    @Override
    public Void call() {
      String salary = item.get("salary");
      item.put("salary", recalculate(salary));
      return null;
    }
  }
);

As you see, I wrap that critical transaction into Callable, which will be executed in isolation. This approach, obviously, doesn't guarantee atomicity of the transaction---if part of the transaction fails, there won't be any automatic rollbacks, and the DynamoDB table will be left in a "broken" state.

Locks from stateful.co guarantee isolation in resource usage, and you can use any type of resources, including NoSQL tables, files, S3 objects, embedded software, etc.

I should not forget to add this dependency to my pom.xml:

<dependency>
  <groupId>co.stateful</groupId>
  <artifactId>java-sdk</artifactId>
</dependency>

Of course, you can do the same; the service is absolutely free of charge. And you can use any other languages, not just Java. BTW, if interested, contribute with your own SDK in your preferred language; I'll add it to the GitHub collection.


ORM Is an Offensive Anti-Pattern


TL;DR ORM is a terrible anti-pattern that violates all principles of object-oriented programming, tearing objects apart and turning them into dumb and passive data bags. There is no excuse for ORM existence in any application, be it a small web app or an enterprise-size system with thousands of tables and CRUD manipulations on them. What is the alternative? SQL-speaking objects.

Vinni-Pukh (1969) by Fyodor Khitruk

How ORM Works

Object-relational mapping (ORM) is a technique (a.k.a. design pattern) of accessing a relational database from an object-oriented language (Java, for example). There are multiple implementations of ORM in almost every language; for example: Hibernate for Java, ActiveRecord for Ruby on Rails, Doctrine for PHP, and SQLAlchemy for Python. In Java, the ORM design is even standardized as JPA.

First, let's see how ORM works, by example. Let's use Java, PostgreSQL, and Hibernate. Let's say we have a single table in the database, called post:

+-----+------------+--------------------------+
| id  | date       | title                    |
+-----+------------+--------------------------+
|   9 | 10/24/2014 | How to cook a sandwich   |
|  13 | 11/03/2014 | My favorite movies       |
|  27 | 11/17/2014 | How much I love my job   |
+-----+------------+--------------------------+

Now we want to CRUD-manipulate this table from our Java app (CRUD stands for create, read, update, and delete). First, we should create a Post class (I'm sorry it's so long, but that's the best I can do):

@Entity
@Table(name = "post")
public class Post {
  private int id;
  private Date date;
  private String title;
  @Id
  @GeneratedValue
  public int getId() {
    return this.id;
  }
  @Temporal(TemporalType.TIMESTAMP)
  public Date getDate() {
    return this.date;
  }
  public String getTitle() {
    return this.title;
  }
  public void setDate(Date when) {
    this.date = when;
  }
  public void setTitle(String txt) {
    this.title = txt;
  }
}

Before any operation with Hibernate, we have to create a session factory:

SessionFactory factory = new AnnotationConfiguration()
  .configure()
  .addAnnotatedClass(Post.class)
  .buildSessionFactory();

This factory will give us "sessions" every time we want to manipulate Post objects. Every manipulation with the session should be wrapped in this code block:

Session session = factory.openSession();
Transaction txn = null;
try {
  txn = session.beginTransaction();
  // your manipulations with the ORM, see below
  txn.commit();
} catch (HibernateException ex) {
  if (txn != null) {
    txn.rollback();
  }
} finally {
  session.close();
}

When the session is ready, here is how we get a list of all posts from that database table:

List<Post> posts = (List<Post>) session.createQuery("FROM Post").list();
for (Post post : posts) {
  System.out.println("Title: " + post.getTitle());
}

I think it's clear what's going on here. Hibernate is a big, powerful engine that makes a connection to the database, executes necessary SQL SELECT requests, and retrieves the data. Then it makes instances of class Post and stuffs them with the data. When the object comes to us, it is filled with data, and we should use getters to take them out, like we're using getTitle() above.

When we want to do a reverse operation and send an object to the database, we do all of the same but in reverse order. We make an instance of class Post, stuff it with the data, and ask Hibernate to save it:

Post post = new Post();
post.setDate(new Date());
post.setTitle("How to cook an omelette");
session.save(post);

This is how almost every ORM works. The basic principle is always the same---ORM objects are anemic envelopes with data. We are talking with the ORM framework, and the framework is talking to the database. Objects only help us send our requests to the ORM framework and understand its response. Besides getters and setters, objects have no other methods. They don't even know which database they came from.

This is how object-relational mapping works.

What's wrong with it, you may ask? Everything!

What's Wrong With ORM?

Seriously, what is wrong? Hibernate has been one of the most popular Java libraries for more than 10 years already. Almost every SQL-intensive application in the world is using it. Almost every Java tutorial mentions Hibernate (or some other ORM, like TopLink or OpenJPA) when a database-connected application is being built. It's a de-facto standard, and still I'm saying it's wrong? Yes.

I'm claiming that the entire idea behind ORM is wrong. Its invention was maybe the second biggest mistake in OOP, after the NULL reference.

Actually, I'm not the only one saying something like this, and definitely not the first. A lot about this subject has already been published by very respected authors, including OrmHate by Martin Fowler (not against ORM, but worth mentioning anyway), Object-Relational Mapping Is the Vietnam of Computer Science by Jeff Atwood, The Vietnam of Computer Science by Ted Neward, ORM Is an Anti-Pattern by Laurie Voss, and many others.

However, my argument is different from theirs. Even though their reasons are practical and valid, like "ORM is slow" or "database upgrades are hard," they miss the main point. You can see a very good, practical answer to these practical arguments given by Bozhidar Bozhanov in his ORM Haters Don't Get It blog post.


The main point is that ORM, instead of encapsulating database interaction inside an object, extracts it away, literally tearing a solid and cohesive living organism apart. One part of the object keeps the data while another one, implemented inside the ORM engine (session factory), knows how to deal with this data and transfers it to the relational database. Look at this picture; it illustrates what ORM is doing.

I, as a user of posts, have to deal with two components: 1) the ORM and 2) the "ob-truncated" object it returns to me. The behavior I'm interacting with is supposed to be provided through a single entry point, which in OOP is an object. In the case of ORM, I'm getting this behavior via two entry points---the ORM engine and the "thing," which we can't even call an object.

Because of this terrible and offensive violation of the object-oriented paradigm, we have a lot of practical issues already mentioned in respected publications. I can only add a few more.

SQL Is Not Hidden. Users of an ORM must speak SQL (or a dialect of it, like HQL). See the example above; we're calling session.createQuery("FROM Post") in order to get all posts. Even though it's not exactly SQL, it is very similar. Thus, the relational model is not encapsulated inside objects. Instead, it is exposed to the entire application. Everybody, with each object, inevitably has to deal with the relational model in order to get or save something. Thus, ORM doesn't hide and wrap the SQL but pollutes the entire application with it.

Difficult to Test. When some object is working with a list of posts, it needs an instance of SessionFactory. How can we mock this dependency? We would have to create a mock of it, and how complex would that be? Look at the code above, and you will realize how verbose and cumbersome such a unit test would be. Alternatively, we can write integration tests and connect the entire application to a test instance of PostgreSQL. In that case, there is no need to mock SessionFactory, but such tests will be rather slow, and, even more important, our having-nothing-to-do-with-the-database objects will be tested against a database instance. A terrible design.

Again, let me reiterate. Practical problems of ORM are just consequences. The fundamental drawback is that ORM tears objects apart, terribly and offensively violating the very idea of what an object is.

SQL-Speaking Objects


What is the alternative? Let me show it to you by example. Let's try to design that class, Post, my way. We'll have to break it down into two classes: Post and Posts, singular and plural. I already mentioned in one of my previous articles that a good object is always an abstraction of a real-life entity. Here is how this principle works in practice. We have two entities: database table and table row. That's why we'll make two classes; Posts will represent the table, and Post will represent the row.

As I also mentioned in that article, every object should work by contract and implement an interface. Let's start our design with two interfaces. Of course, our objects will be immutable. Here is how Posts would look:

interface Posts {
  Iterable<Post> iterate();
  Post add(Date date, String title);
}

This is how a single Post would look:

interface Post {
  int id();
  Date date();
  String title();
}

Here is how we will list all posts in the database table:

Posts posts = // we'll discuss this right now
for (Post post : posts.iterate()){
  System.out.println("Title: " + post.title());
}

Here is how we will create a new post:

Posts posts = // we'll discuss this right now
posts.add(new Date(), "How to cook an omelette");

As you see, we have true objects now. They are in charge of all operations, and they perfectly hide their implementation details. There are no transactions, sessions, or factories. We don't even know whether these objects are actually talking to the PostgreSQL or if they keep all the data in text files. All we need from Posts is an ability to list all posts for us and to create a new one. Implementation details are perfectly hidden inside. Now let's see how we can implement these two classes.
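Because Posts and Post are plain interfaces, any code that consumes them can be unit-tested with a trivial in-memory fake instead of a mocked SessionFactory. Here is a minimal sketch of such a fake (FakePosts is a hypothetical name; the two interfaces are repeated so the snippet compiles on its own):

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// The two contracts from the article, repeated so the snippet is self-contained.
interface Post {
  int id();
  Date date();
  String title();
}

interface Posts {
  Iterable<Post> iterate();
  Post add(Date date, String title);
}

// A hypothetical in-memory fake: no SessionFactory, no database, no mocks.
final class FakePosts implements Posts {
  private final List<Post> all = new ArrayList<Post>();
  @Override
  public Iterable<Post> iterate() {
    return new ArrayList<Post>(this.all);
  }
  @Override
  public Post add(final Date date, final String title) {
    final int id = this.all.size() + 1;
    final Post post = new Post() {
      @Override public int id() { return id; }
      @Override public Date date() { return date; }
      @Override public String title() { return title; }
    };
    this.all.add(post);
    return post;
  }
}

public class FakePostsDemo {
  public static void main(String[] args) {
    Posts posts = new FakePosts();
    Post post = posts.add(new Date(), "How to cook an omelette");
    System.out.println("post #" + post.id() + ": " + post.title());
  }
}
```

Any object that accepts a Posts in its constructor can be given this fake in a test and a real PostgreSQL-backed implementation in production, with no change to its code.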

I'm going to use jcabi-jdbc as a JDBC wrapper, but you can use something else like jOOQ, or just plain JDBC if you like. It doesn't really matter. What matters is that your database interactions are hidden inside objects. Let's start with Posts and implement it in class PgPosts ("pg" stands for PostgreSQL):

final class PgPosts implements Posts {
  private final DataSource dbase;
  public PgPosts(DataSource data) {
    this.dbase = data;
  }
  public Iterable<Post> iterate() {
    return new JdbcSession(this.dbase)
      .sql("SELECT id FROM post")
      .select(
        new ListOutcome<Post>(
          new ListOutcome.Mapping<Post>() {
            @Override
            public Post map(final ResultSet rset) throws SQLException {
              return new PgPost(
                PgPosts.this.dbase,
                rset.getInt(1)
              );
            }
          }
        )
      );
  }
  public Post add(Date date, String title) {
    return new PgPost(
      this.dbase,
      new JdbcSession(this.dbase)
        .sql("INSERT INTO post (date, title) VALUES (?, ?)")
        .set(new Utc(date))
        .set(title)
        .insert(new SingleOutcome<Integer>(Integer.class))
    );
  }
}

Next, let's implement the Post interface in class PgPost:

final class PgPost implements Post {
  private final DataSource dbase;
  private final int number;
  public PgPost(DataSource data, int id) {
    this.dbase = data;
    this.number = id;
  }
  public int id() {
    return this.number;
  }
  public Date date() {
    return new JdbcSession(this.dbase)
      .sql("SELECT date FROM post WHERE id = ?")
      .set(this.number)
      .select(new SingleOutcome<Utc>(Utc.class));
  }
  public String title() {
    return new JdbcSession(this.dbase)
      .sql("SELECT title FROM post WHERE id = ?")
      .set(this.number)
      .select(new SingleOutcome<String>(String.class));
  }
}

Here is how a full database interaction scenario looks using the classes we just created:

Posts posts = new PgPosts(dbase);
for (Post post : posts.iterate()){
  System.out.println("Title: " + post.title());
}
Post post = posts.add(
  new Date(), "How to cook an omelette"
);
System.out.println("Just added post #" + post.id());

You can see a full practical example here. It's an open source web app that works with PostgreSQL using the exact approach explained above---SQL-speaking objects.

What About Performance?

I can hear you screaming, "What about performance?" In that script a few lines above, we're making many redundant round trips to the database. First, we retrieve post IDs with SELECT id and then, in order to get their titles, we make an extra SELECT title call for each post. This is inefficient, or simply put, too slow.

No worries; this is object-oriented programming, which means it is flexible! Let's create a decorator of PgPost that will accept all data in its constructor and cache it internally, forever:

final class ConstPost implements Post {
  private final Post origin;
  private final Date dte;
  private final String ttl;
  public ConstPost(Post post, Date date, String title) {
    this.origin = post;
    this.dte = date;
    this.ttl = title;
  }
  public int id() {
    return this.origin.id();
  }
  public Date date() {
    return this.dte;
  }
  public String title() {
    return this.ttl;
  }
}

Pay attention: This decorator doesn't know anything about PostgreSQL or JDBC. It just decorates an object of type Post and pre-caches the date and title. As usual, this decorator is also immutable.
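To see that the decorator really eliminates extra round trips, we can wrap a stub Post that fails loudly if its date() or title() is ever queried. A small self-contained sketch (the Post interface is repeated here, and the stub origin is purely illustrative):

```java
import java.util.Date;

interface Post {
  int id();
  Date date();
  String title();
}

// ConstPost as in the article: decorates a Post and pre-caches date and title.
final class ConstPost implements Post {
  private final Post origin;
  private final Date dte;
  private final String ttl;
  ConstPost(Post post, Date date, String title) {
    this.origin = post;
    this.dte = date;
    this.ttl = title;
  }
  @Override public int id() { return this.origin.id(); }
  @Override public Date date() { return this.dte; }
  @Override public String title() { return this.ttl; }
}

public class ConstPostDemo {
  public static void main(String[] args) {
    // A stub origin that throws if anyone attempts a "database round trip".
    Post origin = new Post() {
      @Override public int id() { return 9; }
      @Override public Date date() {
        throw new IllegalStateException("unexpected database round trip");
      }
      @Override public String title() {
        throw new IllegalStateException("unexpected database round trip");
      }
    };
    Post post = new ConstPost(origin, new Date(0L), "How to cook a sandwich");
    // title() is served from the cache; only id() touches the origin.
    System.out.println(post.title() + " #" + post.id());
  }
}
```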

Now let's create another implementation of Posts that will return the "constant" objects:

final class ConstPgPosts implements Posts {
  // ...
  public Iterable<Post> iterate() {
    return new JdbcSession(this.dbase)
      .sql("SELECT * FROM post")
      .select(
        new ListOutcome<Post>(
          new ListOutcome.Mapping<Post>() {
            @Override
            public Post map(final ResultSet rset) throws SQLException {
              return new ConstPost(
                new PgPost(
                  ConstPgPosts.this.dbase,
                  rset.getInt(1)
                ),
                Utc.getTimestamp(rset, 2),
                rset.getString(3)
              );
            }
          }
        )
      );
  }
}

Now all posts returned by iterate() of this new class are pre-equipped with dates and titles fetched in one round trip to the database.

Using decorators and multiple implementations of the same interface, you can compose any functionality you wish. Most importantly, while functionality is being extended, the complexity of the design does not escalate, because classes don't grow in size. Instead, we introduce new classes that stay cohesive and solid, because they are small.

What About Transactions?

Every object should deal with its own transactions and encapsulate them the same way as SELECT or INSERT queries. This will lead to nested transactions, which is perfectly fine provided the database server supports them. If there is no such support, create a session-wide transaction object that will accept a "callable" class. For example:

final class Txn {
  private final DataSource dbase;
  public Txn(DataSource data) {
    this.dbase = data;
  }
  public <T> T call(Callable<T> callable) throws Exception {
    JdbcSession session = new JdbcSession(this.dbase);
    try {
      session.sql("START TRANSACTION").exec();
      T result = callable.call();
      session.sql("COMMIT").exec();
      return result;
    } catch (Exception ex) {
      session.sql("ROLLBACK").exec();
      throw ex;
    }
  }
}

Then, when you want to wrap a few object manipulations in one transaction, do it like this:

new Txn(dbase).call(
  new Callable<Integer>() {
    @Override
    public Integer call() {
      Posts posts = new PgPosts(dbase);
      Post post = posts.add(
        new Date(), "How to cook an omelette"
      );
      post.comments().post("This is my first comment!");
      return post.id();
    }
  }
);

This code will create a new post and add a comment to it. If one of the calls fails, the entire transaction will be rolled back.

This approach looks object-oriented to me. I'm calling it "SQL-speaking objects," because they know how to speak SQL with the database server. It's their skill, perfectly encapsulated inside their borders.

© Yegor Bugayenko 2014–2018

Five Principles of Bug Tracking


A team working remotely requires much stronger discipline than a co-located crew sitting in the same office. First of all, I mean the discipline of communication. At Zerocracy, we have developed software remotely for the last five years. We manage tasks strictly through ticketing systems (like GitHub, JIRA, Trac, Basecamp, etc.) and discourage informal communications, like Skype, HipChat, emails, or phone calls. Every ticket for us is an isolated task with its own life cycle, its own participants, and its own goal. Over these years, we've learned a few lessons that I want to share. If you also work remotely with your team, you may find them useful.

Monty Python Flying Circus, TV Series (1969-1974)

1. Keep It One-on-One

Each ticket (aka "bug") is a link between two people: problem specifier and problem solver. If it is a bug, I'm reporting it---you're solving it. If it is a question, I'm asking for an explanation---you're explaining. If it is a task, I'm ordering you to do it---you're doing it. In any case, there are two main characters. No matter how many people are involved in the ticket resolution, only these two characters have formal roles.

The responsibility of the ticket reporter is to defend the problem. When I report a bug, I have to insist that it exists---this is my job. Others may tell me that I'm wrong and the bug is not there. They may tell me that they can't reproduce it. They may say that my description of a task is too vague and nobody understands it. There may be many issues of that kind. My job is to do the best I can in order to keep the ticket alive. Obviously, if the bug is not reproducible, I'll be forced to close the ticket. However, until the ticket is closed, I'm its guardian angel. :)

On the other hand, the responsibility of the ticket solver is to defend the solution. When a ticket is assigned to me and I have to resolve it, my job is to convince the reporter that my solution is good enough. He may tell me that my solution is insufficient, inefficient, or incomplete. My job is to insist that I'm right and he is wrong---in a reasonable way, of course. And in order to create a solution that will be accepted as sufficient, I have to understand the problem first, investigate all possible options, and propose the most elegant implementation. But all this is secondary. The first thing I will be focused on is how to convince the reporter. I will always remember that my primary goal is to close the ticket.

My point here is that no matter how many people are involved in the ticket discussion, always remember what is happening there---one person is selling his solution to another person. Everybody else around them is help or distraction (see below).

2. Close It!

Remember that a ticket is not a chat. You're not there to talk. You're there to close. When the ticket is assigned to you, focus on closing it as soon as possible (Always Be Closing, remember?).

Also, keep in mind that the sooner you close the ticket, the better job you will do for the project. Long-living tickets are a management nightmare. It is difficult to track them and control them. It's difficult to understand what's going on. Have you seen those two-year-old tickets in open source projects that have hundreds of comments and no deliverables? It is a mistake by their project managers and ticket participants. Each ticket should be short and focused---1) a problem, 2) a refinement question, 3) a short explanation, 4) a solution, 5) closed, thanks everybody. This is an ideal scenario.

As soon as you realize that your ticket is turning into a long discussion, try to close it even faster. How can I close it if the reporter doesn't like my solution? Find a temporary solution that will satisfy the reporter and allow you to close the ticket. Use "TODO" in your code or dirty workarounds---they are all better than a ticket hovering for a long time.

Once you see that the solution is provided and is sufficient to close the ticket, ask its reporter to close it. Explicitly ask for that; don't dance around with "looks like this solution may be accepted, if you don't mind." Be explicit in your intention to close the ticket and move on. Try this: "@jeff, please close the ticket if you don't have any further questions."

3. Don't Close It!

Every time you raise a bug and create a new ticket, you consume project resources. Every bug report means money spent on the project: 1) money for your time spent finding the problem and reporting it; 2) for the time of the project manager who is working with the ticket and finding who will fix it; 3) for the time of the ticket solver, who is trying to understand your report and provide a solution; and also 4) for the time of everybody else who will participate in the discussion.

If you close the ticket without a problem being properly solved, you put this money into the trash bin. Once the ticket is started, there is no way back. We can't just say, "Nah, ignore it; it's not important anymore." Your ticket already consumed project time and budget resources, and in order to turn them into something useful, you have to make sure that some solution is delivered.

It can be a temporary solution. It can be a single line change in the project documentation. It can be a TODO marker in the code saying that "we are aware of the problem but won't fix it because we're lazy." Anything would work, but not nothing.

Look at it from a different perspective. When you started that ticket, you had something in mind. Something was not right with the product. That's why you reported a bug. If you close the ticket without anyone even touching that place of code, someone else will have the same concern in a few days or a few years. And then the project will have to pay again for a similar ticket or discussion of the same problem. Even if you're convinced that the issue you found in the code is not really an issue, ask a ticket resolver to document it right in the source code in order to prevent such confusion from happening again in the future.

4. Avoid Noise---Address Your Comments

Every time you post a message to the ticket, address it to someone. Otherwise, if you post just because you want to express your opinion, your comments become communication noise. Remember, a ticket is a conversation between two people---one of them reported an issue, and the other one is trying to fix it. Comments like "How about we try another approach?" or "I remember I had a similar issue some time ago" are very annoying and distracting. Let's be honest; nobody really needs or cares about opinions. All we need in a ticket is a solution.

If you think the ticket should be closed because the introduced solution is good enough, address your comment to the ticket reporter, and start it with "@jeff, I think the solution you've got already is good enough, because..." This way, you will help the assignee close the ticket and move on.

If you think the solution is wrong, address your comment to the assignee of the ticket, starting with "@jeff, I believe your solution is not good enough because..." This way, you will help the ticket reporter keep the ticket open until a proper solution comes up.

Again, don't pollute the air with generic opinions. Instead, be very specific and take sides---you either like the solution and want the ticket to be closed, or you don't like it and want the ticket to stay open. Everything in between is just making the situation more complex and isn't helping the project at all.

5. Report When It Is Broken

I think it is obvious, but I will reiterate: Every bug has to be reproducible. Every time you report a bug, you should explain how exactly the product is broken. Yes, it is your job to prove that the software doesn't work as intended, or is not documented properly, or doesn't satisfy the requirements, etc.

Every bug report should follow the same simple formula: "This is what we have, this is what we should have instead, so fix it." Every ticket, be it a bug, a task, a question, or a suggestion, should be formatted in this way. By submitting it, you're asking the project to move from point A to point B. Something is not right at point A, and it will be much better for all of us to be at that point B. So it's obvious that you have to explain where these points A and B are. It is highly desirable if you can explain how to get there---how to reproduce a problem and how to fix it.

Even when you have a question, you should follow the same format. If you have a question, it means the project documentation is not sufficient for you to find the answer there. This is what is broken; you should ask for a fix. So instead of asking, "How should I use class X?" say something like, "The current documentation is incomplete; it doesn't explain how I should use class X. Please fix."

If you can't explain how to get there, say so in the ticket: "I see that this class doesn't work as it should, but I don't know how to reproduce the problem or how to fix it." This will give everybody a clear message that you are aware your bug report is not perfect. The first step for its resolver will be to refine the problem and find a way to reproduce it. If no reproduction can be found, your bug will, obviously, be forced into closing.

Let me reiterate again: Every ticket is dragging the project from point A, where something is not right, to point B, where it is fixed. Your job, as a ticket reporter, is to draw that line---clearly and explicitly.


Seven Virtues of a Good Object


Martin Fowler says:

A library is essentially a set of functions that you can call, these days usually organized into classes.

Functions organized into classes? With all due respect, this is wrong. And it is a very common misconception of a class in object-oriented programming. Classes are not organizers of functions. And objects are not data structures.

So what is a "proper" object? Which one is not a proper one? What is the difference? Even though it is a very polemical subject, it is very important. Unless we understand what an object is, how can we write object-oriented software? Well, thanks to Java, Ruby, and others, we can. But how good will it be? Unfortunately, this is not an exact science, and there are many opinions. Here is my list of qualities of a good object.

Class vs. Object


Before we start talking about objects, let's define what a class is. It is a place where objects are being born (a.k.a. instantiated). The main responsibility of a class is to construct new objects on demand and destruct them when they are not used anymore. A class knows how its children should look and how they should behave. In other words, it knows what contracts they should obey.

Sometimes I hear classes being called "object templates" (for example, Wikipedia says so). This definition is not correct because it places classes into a passive position. This definition assumes that someone will get a template and build an object by using it. This may be true, technically speaking, but conceptually it's wrong. Nobody else should be involved---there are only a class and its children. An object asks a class to create another object, and the class constructs it; that's it. Ruby expresses this concept much better than Java or C++:

photo = File.new('/tmp/photo.png')

The object photo is constructed by the class File (new is an entry point to the class). Once constructed, the object is acting on its own. It shouldn't know who constructed it and how many more brothers and sisters it has in the class. Yes, I mean that reflection is a terrible idea, but I'll write more about it in one of the next posts :) Now, let's talk about objects and their best and worst sides.

1. He Exists in Real Life


First of all, an object is a living organism. Moreover, an object should be anthropomorphized, i.e. treated like a human being (or a pet, if you like them more). By this I basically mean that an object is not a data structure or a collection of functions. Instead, it is an independent entity with its own life cycle, its own behavior, and its own habits.

An employee, a department, an HTTP request, a table in MySQL, a line in a file, or a file itself are proper objects---because they exist in real life, even when our software is turned off. To be more precise, an object is a representative of a real-life creature. It is a proxy of that real-life creature in front of all other objects. Without such a creature, there is---obviously---no object.

photo = File.new('/tmp/photo.png')
puts photo.width()

In this example, I'm asking File to construct a new object photo, which will be a representative of a real file on disk. You may say that a file is also something virtual and exists only when the computer is turned on. I would agree and refine the definition of "real life" as follows: It is everything that exists aside from the scope of the program the object lives in. The disk file is outside the scope of our program; that's why it is perfectly correct to create its representative inside the program.

A controller, a parser, a filter, a validator, a service locator, a singleton, or a factory are not good objects (yes, most GoF patterns are anti-patterns!). They don't exist apart from your software, in real life. They are invented just to tie other objects together. They are artificial and fake creatures. They don't represent anyone. Seriously, an XML parser---who does it represent? Nobody.

Some of them may become good if they change their names; others can never excuse their existence. For example, that XML parser can be renamed to "parseable XML" and start to represent an XML document that exists outside of our scope.

Always ask yourself, "What is the real-life entity behind my object?" If you can't find an answer, start thinking about refactoring.

2. He Works by Contracts


A good object always works by contracts. He expects to be hired not because of his personal merits but because he obeys the contracts. On the other hand, when we hire an object, we shouldn't discriminate and expect some specific object from a specific class to do the work for us. We should expect any object to do what our contract says. As long as the object does what we need, we should not be interested in his class of origin, his sex, or his religion.

For example, I need to show a photo on the screen. I want that photo to be read from a file in PNG format. I'm contracting an object from class DataFile and asking him to give me the binary content of that image.

But wait, do I care where exactly the content will come from---the file on disk, or an HTTP request, or maybe a document in Dropbox? Actually, I don't. All I care about is that some object gives me a byte array with PNG content. So my contract would look like this:

interface Binary {
  byte[] read();
}

Now, any object from any class (not just DataFile) can work for me. All he has to do, in order to be eligible, is to obey the contract---by implementing the interface Binary.

The rule here is simple: every public method in a good object should implement his counterpart from an interface. If your object has public methods that are not inherited from any interface, he is badly designed.

There are two practical reasons for this. First, an object working without a contract is impossible to mock in a unit test. Second, a contract-less object is impossible to extend via decoration.
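Both points are easy to see with the Binary contract itself: a throwaway stub implements the interface, and any consumer coded against Binary accepts it without knowing the difference. A minimal sketch (FakeBinary is a hypothetical name, shown only for illustration):

```java
// The contract from the article.
interface Binary {
  byte[] read();
}

// A hypothetical stub obeying the Binary contract; any consumer coded
// against Binary can now be tested without touching a disk or a socket.
final class FakeBinary implements Binary {
  private final byte[] bytes;
  FakeBinary(byte[] data) {
    this.bytes = data.clone();
  }
  @Override
  public byte[] read() {
    // Return a defensive copy so the stub stays immutable.
    return this.bytes.clone();
  }
}

public class FakeBinaryDemo {
  public static void main(String[] args) {
    Binary bin = new FakeBinary(new byte[] {1, 2, 3});
    System.out.println("bytes: " + bin.read().length);
  }
}
```

The same mechanism enables decoration: a class wrapping another Binary (say, one that caches or logs reads) also just implements Binary, so consumers never notice the difference.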

3. He Is Unique

A good object should always encapsulate something in order to be unique. If there is nothing to encapsulate, an object may have identical clones, which I believe is bad. Here is an example of a bad object, which may have clones:

class HTTPStatus implements Status {
  private URL page = new URL("http://localhost");
  @Override
  public int read() throws IOException {
    return HttpURLConnection.class.cast(
      this.page.openConnection()
    ).getResponseCode();
  }
}

I can create a few instances of class HTTPStatus, and all of them will be equal to each other:

Status first = new HTTPStatus();
Status second = new HTTPStatus();
assert first.equals(second);

Obviously, utility classes, which have only static methods, can't instantiate good objects. More generally, utility classes don't have any of the merits mentioned in this article and can't even be called "classes." They are simply terrible abusers of the object paradigm and exist in modern object-oriented languages only because their inventors enabled static methods.

4. He Is Immutable

A good object should never change his encapsulated state. Remember, an object is a representative of a real-life entity, and this entity should stay the same through the entire life of the object. In other words, an object should never betray those whom he represents. He should never change owners. :)

Be aware that immutability doesn't mean that all methods always return the same values. Instead, a good immutable object is very dynamic. However, he never changes his internal state. For example:

@Immutable
final class HTTPStatus implements Status {
  private final URL page;
  public HTTPStatus(URL url) {
    this.page = url;
  }
  @Override
  public int read() throws IOException {
    return HttpURLConnection.class.cast(
      this.page.openConnection()
    ).getResponseCode();
  }
}

Even though the method read() may return different values, the object is immutable. He points to a certain web page and will never point anywhere else. He will never change his encapsulated state, and he will never betray the URL he represents.

Why is immutability a virtue? This article explains in detail: Objects Should Be Immutable. In a nutshell, immutable objects are better because:

  • Immutable objects are simpler to construct, test, and use.
  • Truly immutable objects are always thread-safe.
  • They help avoid temporal coupling.
  • Their usage is side-effect free (no defensive copies).
  • They always have failure atomicity.
  • They are much easier to cache.
  • They prevent NULL references.

Of course, a good object doesn't have setters, which may change his state and force him to betray the URL. In other words, introducing a setURL() method would be a terrible mistake in class HTTPStatus.

Besides all that, immutable objects will force you to make more cohesive, solid, and understandable designs, as this article explains: How Immutability Helps.

5. His Class Doesn't Have Anything Static

A static method implements a behavior of a class, not an object. Let's say we have the class File, and his objects have the method size():

final class File implements Measurable {
  @Override
  public int size() {
    // calculate the size of the file and return
  }
}

So far, so good; the method size() is there because of the contract Measurable, and every object of class File will be able to measure his size. A terrible mistake would be to design this class with a static method instead (this design is also known as a utility class and is very popular in Java, Ruby, and almost every OOP language):

// TERRIBLE DESIGN, DON'T USE!
class File {
  public static int size(String file) {
    // calculate the size of the file and return
  }
}

This design runs completely against the object-oriented paradigm. Why? Because static methods turn object-oriented programming into "class-oriented" programming. This method, size(), exposes the behavior of the class, not of his objects. What's wrong with this, you may ask? Why can't we have both objects and classes as first-class citizens in our code? Why can't both of them have methods and properties?

The problem is that with class-oriented programming, decomposition doesn't work anymore. We can't break down a complex problem into parts, because only a single instance of a class exists in the entire program. The power of OOP is that it allows us to use objects as an instrument for scope decomposition. When I instantiate an object inside a method, he is dedicated to my specific task. He is perfectly isolated from all other objects around the method. This object is a local variable in the scope of the method. A class, with his static methods, is always a global variable no matter where I use him. Because of that, I can't isolate my interaction with this variable from others.

Besides being conceptually against object-oriented principles, public static methods have a few practical drawbacks:

First, it's impossible to mock them (Well, you can use PowerMock, but this will then be the most terrible decision you could make in a Java project... I made it once, a few years ago).
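
By contrast, an instance method declared by an interface needs no mocking framework at all; a hand-made fake is enough. A sketch, assuming the Measurable contract from above (FakeFile is a hypothetical name I introduce here):

```java
interface Measurable {
  int size();
}

// A hand-made fake for unit tests: possible only because size()
// is an instance method declared by an interface, not a static one.
final class FakeFile implements Measurable {
  private final int bytes;
  FakeFile(int bytes) {
    this.bytes = bytes;
  }
  @Override
  public int size() {
    return this.bytes;
  }
}
```

Any code that expects a Measurable can now be tested with new FakeFile(100); a static File.size() would leave PowerMock as the only option.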

Second, they are not thread-safe by definition, because they always work with static variables, which are accessible from all threads. You can make them thread-safe, but this will always require explicit synchronization.

Every time you see a public static method, start rewriting immediately. I don't even want to mention how terrible static (or global) variables are. I think it is just obvious.

6. His Name Is Not a Job Title

The name of an object should tell us what this object is, not what it does, just like we name objects in real life: book instead of page aggregator, cup instead of water holder, T-shirt instead of body dresser. There are exceptions, of course, like printer or computer, but they were invented just recently and by those who didn't read this article. :)

For example, these names tell us who their owners are: an apple, a file, a series of HTTP requests, a socket, an XML document, a list of users, a regular expression, an integer, a PostgreSQL table, or Jeffrey Lebowski. A properly named object is always possible to draw as a small picture. Even a regular expression can be drawn.

By contrast, here is an example of names that tell us what their owners do: a file reader, a text parser, a URL validator, an XML printer, a service locator, a singleton, a script runner, or a Java programmer. Can you draw any of them? No, you can't. These names are not suitable for good objects. They are terrible names that lead to terrible design.

In general, avoid names that end with "-er"---most of them are bad.

"What is the alternative of a FileReader?" I hear you asking. What would be a better name? Let's see. We already have File, which is a representative of a real-world file on disk. This representative is not powerful enough for us, because he doesn't know how to read the content of the file. We want to create a more powerful one that will have that ability. What would we call him? Remember, the name should say what he is, not what he does. What is he? He is a file that has data; not just a file, like File, but a more sophisticated one, with data. So how about FileWithData or simply DataFile?

The same logic should be applicable to all other names. Always think about what it is rather than what it does. Give your objects real, meaningful names instead of job titles.

More about this in Don't Create Objects That End With -ER.

7. His Class Is Either Final or Abstract

A good object comes from either a final or abstract class. A final class is one that can't be extended via inheritance. An abstract class is one that can't have instances. Simply put, a class should either say, "You can never break me; I'm a black box for you" or "I'm broken already; fix me first and then use."

There is nothing in between. A final class is a black box that you can't modify by any means. He works as he works, and you either use him or throw him away. You can't create another class that will inherit his properties; that is not allowed, because of that final modifier. The only way to extend such a final class is through decoration. Let's say I have the class HTTPStatus (see above), and I don't like him. Well, I like him, but he's not powerful enough for me. I want him to throw an exception if the HTTP status is over 400. I want his method, read(), to do more than it does now. A traditional way would be to extend the class and override his method:

class OnlyValidStatus extends HTTPStatus {
  public OnlyValidStatus(URL url) {
    super(url);
  }
  @Override
  public int read() throws IOException {
    int code = super.read();
    if (code >= 400) {
      throw new RuntimeException("Unsuccessful HTTP code");
    }
    return code;
  }
}

Why is this wrong? It is very wrong because we risk breaking the logic of the entire parent class by overriding one of his methods. Remember, once we override the method read() in the child class, all methods from the parent class start to use his new version. We're literally injecting a new "piece of implementation" right into the class. Philosophically speaking, this is an offense.

On the other hand, to extend a final class, you have to treat him like a black box and decorate him with your own implementation (a.k.a. Decorator Pattern):

final class OnlyValidStatus implements Status {
  private final Status origin;
  public OnlyValidStatus(Status status) {
    this.origin = status;
  }
  @Override
  public int read() throws IOException {
    int code = this.origin.read();
    if (code >= 400) {
      throw new RuntimeException("Unsuccessful HTTP code");
    }
    return code;
  }
}

Make sure that this class is implementing the same interface as the original one: Status. The instance of HTTPStatus will be passed into him through the constructor and encapsulated. Then every call will be intercepted and implemented in a different way, if necessary. In this design, we treat the original object as a black box and never touch his internal logic.

If you don't use that final keyword, anyone (including yourself) will be able to extend the class and... offend him :( So a class without final is a bad design.
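
Using such a decorator is plain composition. The sketch below demonstrates it with FixedStatus, a hypothetical stand-in for HTTPStatus introduced only to keep the example self-contained and off the network:

```java
import java.io.IOException;

interface Status {
  int read() throws IOException;
}

// Stand-in for HTTPStatus: always reports the same code.
final class FixedStatus implements Status {
  private final int code;
  FixedStatus(int code) {
    this.code = code;
  }
  @Override
  public int read() {
    return this.code;
  }
}

// The decorator from the text: intercepts read() and validates.
final class OnlyValidStatus implements Status {
  private final Status origin;
  OnlyValidStatus(Status status) {
    this.origin = status;
  }
  @Override
  public int read() throws IOException {
    int code = this.origin.read();
    if (code >= 400) {
      throw new RuntimeException("Unsuccessful HTTP code");
    }
    return code;
  }
}
```

A call like new OnlyValidStatus(new FixedStatus(200)).read() passes the code through, while wrapping a FixedStatus(404) makes read() throw; the original object stays a black box either way.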

An abstract class is the exact opposite case---he tells us that he is incomplete and we can't use him "as is." We have to inject our custom implementation logic into him, but only into the places he allows us to touch. These places are explicitly marked as abstract methods. For example, our HTTPStatus may look like this:

abstract class ValidatedHTTPStatus implements Status {
  private final Status origin;
  protected ValidatedHTTPStatus(Status status) {
    this.origin = status;
  }
  @Override
  public final int read() throws IOException {
    int code = this.origin.read();
    if (!this.isValid()) {
      throw new RuntimeException("Unsuccessful HTTP code");
    }
    return code;
  }
  protected abstract boolean isValid();
}

As you see, the class doesn't know how exactly to validate the HTTP code, and he expects us to inject that logic through inheritance and through overriding the method isValid(). We're not going to offend him with this inheritance, since he defended all other methods with final (pay attention to the modifiers of his methods). Thus, the class is ready for our offense and is perfectly guarded against it.
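
Injecting the validation logic is then a matter of overriding the single abstract method. A self-contained sketch (it declares the origin field and constructor explicitly, and LenientStatus is a hypothetical subclass of mine):

```java
import java.io.IOException;

interface Status {
  int read() throws IOException;
}

abstract class ValidatedHTTPStatus implements Status {
  private final Status origin;
  protected ValidatedHTTPStatus(Status origin) {
    this.origin = origin;
  }
  @Override
  public final int read() throws IOException {
    int code = this.origin.read();
    if (!this.isValid()) {
      throw new RuntimeException("Unsuccessful HTTP code");
    }
    return code;
  }
  protected abstract boolean isValid();
}

// The only thing we may inject: the validation logic itself.
final class LenientStatus extends ValidatedHTTPStatus {
  LenientStatus(Status origin) {
    super(origin);
  }
  @Override
  protected boolean isValid() {
    return true; // accept any code, for illustration
  }
}
```

Everything except isValid() is final, so the subclass can't break the parent's logic no matter what it does.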

To summarize, your class should either be final or abstract---nothing in between.

Update (April 2017): If you also agree that implementation inheritance is evil, all your classes must be final.

© Yegor Bugayenko 2014–2018

Hits-of-Code Instead of SLoC

Lines-of-Code (aka SLoC) is a metric with a terrible reputation. Try to google it yourself and you'll find tons of articles bad-mouthing it as counter-productive and destructive to the software development process. The main argument is that we can't measure the progress of programming by the number of lines of code written. Probably the most famous quote is attributed to Bill Gates:

Measuring programming progress by lines of code is like measuring aircraft building progress by weight

Basically, this means that certain parts of the aircraft take much more effort while weighing much less than others (a central computer, for example). Instead of measuring the weight of the aircraft, we should measure the effort put into it... somehow. So, here is the idea: how about we measure the number of times programmers touch the lines? Instead of counting the number of lines, we'll count how many times they were actually modified---we can get this information from Git (or any other SCM). The more you touch that part of the aircraft, the more effort you spent on it, right?

I called it Hits-of-Code (HoC) and created a small tool to help us calculate this number in just one line. It's a Ruby gem, install it and run:

$ gem install hoc
$ hoc
54687

The number 54687 is the total number of Hits-of-Code in your code base. The principle behind this number is primitive---every time a line of code is modified, created, or deleted in a Git commit, the counter increments.
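
The same counting principle can be roughly approximated with plain Git, by summing the insertions and deletions it reports for every commit (a sketch only; the hoc gem handles renames and other corner cases more carefully):

```shell
# Sum all added and deleted lines over the whole history of the
# current Git repository (run inside a repository).
git log --pretty=tformat: --numstat |
  awk '{ hits += $1 + $2 } END { print hits + 0 }'
```

Each --numstat line reports the added and deleted line counts for one file in one commit, so the awk sum is exactly the "how many times were lines touched" number.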

The main reason why this metric is better than LoC is that it is much better aligned with the actual effort invested into the code base. Here is why.

It Always Increments

The HoC metric always goes up. Today it cannot be lower than it was yesterday---just like the effort, it always increments. Lines-of-Code doesn't behave like this. You may have a huge code base today, but after refactoring it will become much smaller. The number of lines of code decreases. Does it mean you are less effective? Definitely not, but that is what the LoC metric tells a non-programmer. A project manager, for example, may decide that since the size of the code base stayed the same over the last month, the team is not working.

HoC doesn't have this counter-intuitive effect. Instead, HoC grows with your every commit. The more you work on the code base, the bigger the HoC. It doesn't matter how big or small the absolute size of your product is. What matters is how much effort you put into it. That's why HoC is very intuitive and may be used as a measurement of software development progress.

Look at this 18-month graph; it shows both metrics together. I used the same Java code base of rultor, a DevOps assistant. The code base experienced a major refactoring a few months ago, as you see on the graph. I think it is obvious which metric on this graph tells us more about the efforts being invested into the product.

It Is Objective

For HoC, it doesn't matter how big the absolute size of the code base is; what matters is how big your relative contribution to it is.

Let's say you have 300K lines of code, and 95% of them were copy-pasted from some third-party libraries (by the way, it is a very common and terrible practice to keep third-party code inside your own repository). The number of lines of code will be big, but the actual custom code part will be relatively small. Thus, the LoC metric will be misleading---it will always show 300K with small increments or decrements around it. Everybody will have the feeling that the team is working with a 300K-line code base.

On the other hand, HoC will always take into account the part of code that is actually being modified. The value of HoC will be objectively correlated with the actual effort of programmers working with the code base.

It Exposes Complexity of Lines

LoC is usually criticized for its neutrality towards code complexity. An auto-generated ORM class or a complex sorting algorithm may have the same size in terms of lines of code, but the first takes seconds to write, while the second may take weeks or months. That's why lines of code is usually considered a false metric.

Hits-of-Code takes complexity into account, because the longer you work with that sorting algorithm the more modifications you make to its lines. Well, this statement is true if you use Git regularly and commit your changes frequently---that is how you tell Git about your work progress.

Conclusion

Finally, look at this list of open source projects completed by our team over the last few years. Every project has two metrics: Lines-of-Code and Hits-of-Code. It is interesting to see how relatively small projects have very big (over a million) HoC numbers. This immediately reminds me how much time we invested into them and how old they are.

I used the HoC metric in this analysis: How much do you pay per line of code? That post compares a traditional project that paid $3.98 per HoC and an open source one, managed by Zerocracy, that paid 13¢.

My conclusion is that this Hits-of-Code metric can be used as a tool for progress tracking in a software development project. Moreover, it can be used for estimations of team size, project budget, development schedule, and so forth. Obviously, HoC can't be the only metric, but in combination with others it may greatly help in estimating, planning, and tracking.

How Immutability Helps

In a few recent posts, including Getters/Setters. Evil. Period., Objects Should Be Immutable, and Dependency Injection Containers are Code Polluters, I universally labeled all mutable objects with "setters" (object methods starting with set) as evil. My argumentation was based mostly on metaphors and abstract examples. Apparently, this wasn't convincing enough for many of you---I received a few requests asking me to provide more specific and practical examples.

Thus, in order to illustrate my strongly negative attitude to "mutability via setters," I took an existing commons-email Java library from Apache and re-designed it my way, without setters and with "object thinking" in mind. I released my library as part of the jcabi family---jcabi-email. Let's see what benefits we get from a "pure" object-oriented and immutable approach, without getters.

Here is how your code will look, if you send an email using commons-email:

Email email = new SimpleEmail();
email.setHostName("smtp.googlemail.com");
email.setSmtpPort(465);
email.setAuthenticator(new DefaultAuthenticator("user", "pwd"));
email.setFrom("yegor256@gmail.com", "Yegor Bugayenko");
email.addTo("dude@jcabi.com");
email.setSubject("how are you?");
email.setMsg("Dude, how are you?");
email.send();

Here is how you do the same with jcabi-email:

Postman postman = new Postman.Default(
  new SMTP("smtp.googlemail.com", 465, "user", "pwd")
);
Envelope envelope = new Envelope.MIME(
  new Array<Stamp>(
    new StSender("Yegor Bugayenko <yegor256@gmail.com>"),
    new StRecipient("dude@jcabi.com"),
    new StSubject("how are you?")
  ),
  new Array<Enclosure>(
    new EnPlain("Dude, how are you?")
  )
);
postman.send(envelope);

I think the difference is obvious.

In the first example, you're dealing with a monster class that can do everything for you, including sending your MIME message via SMTP, creating the message, configuring its parameters, adding MIME parts to it, etc. The Email class from commons-email is really a huge class---33 private properties, over a hundred methods, and about two thousand lines of code. First, you configure the class through a bunch of setters, and then you ask it to send() an email for you.

In the second example, we have seven objects instantiated via seven new calls. Postman is responsible for packaging a MIME message; SMTP is responsible for sending it via SMTP; stamps (StSender, StRecipient, and StSubject) are responsible for configuring the MIME message before delivery; enclosure EnPlain is responsible for creating a MIME part for the message we're going to send. We construct these seven objects, encapsulating one into another, and then we ask the postman to send() the envelope for us.

What's Wrong With a Mutable Email?

From a user perspective, there is almost nothing wrong. Email is a powerful class with multiple controls---just hit the right one and the job gets done. However, from a developer perspective, the Email class is a nightmare, mostly because it is very big and difficult to maintain.

Because the class is so big, every time you want to extend it by introducing a new method, you're facing the fact that you're making the class even worse---longer, less cohesive, less readable, less maintainable, etc. You have a feeling that you're digging into something dirty and that there is no hope of ever making it cleaner. I'm sure you're familiar with this feeling---most legacy applications look that way. They have huge, multi-thousand-line "classes" (in reality, COBOL programs written in Java) that were inherited from a few generations of programmers before you. When you start, you're full of energy, but after a few minutes of scrolling through such a "class" you say---"screw it, it's almost Saturday."

Because the class is so big, there is no data hiding or encapsulation anymore---33 variables are accessible by over 100 methods. What is hidden? This Email.java file is in reality a big, procedural 2000-line script, called a "class" by mistake. Nothing is hidden; once you cross the border of the class by calling one of its methods, you have full access to all the data you may need. Why is this bad? Well, why do we need encapsulation in the first place? In order to protect one programmer from another, a.k.a. defensive programming. While I'm busy changing the subject of the MIME message, I want to be sure that some other method's activity doesn't interfere with me by changing the sender and touching my subject by mistake. Encapsulation helps us narrow down the scope of the problem, while this Email class does exactly the opposite.

Because the class is so big, its unit testing is even more complicated than the class itself. Why? Because of multiple inter-dependencies between its methods and properties. In order to test setCharset() you have to prepare the entire object by calling a few other methods, then you have to call send() to make sure the message being sent actually uses the encoding you specified. Thus, in order to test a one-line method setCharset() you run the entire integration testing scenario of sending a full MIME message through SMTP. Obviously, if something gets changed in one of the methods, almost every test method will be affected. In other words, tests are very fragile, unreliable and over-complicated.

I can go on and on with this "because the class is so big," but I think it is obvious that a small, cohesive class is always better than a big one. It is obvious to me, to you, and to any object-oriented programmer. But why is it not so obvious to the developers of Apache Commons Email? I don't think they are stupid or un-educated. What is it then?

How and Why Did It Happen?

This is how it always happens. You start to design a class as something cohesive, solid, and small. Your intentions are very positive. Very soon you realize that there is something else that this class has to do. Then, something else. Then, even more.

The best way to make your class more and more powerful is by adding setters that inject configuration parameters into the class so that it can process them inside, isn't it?

This is the root cause of the problem! The root cause is our ability to insert data into mutable objects via configuration methods, also known as "setters." When an object is mutable and allows us to add setters whenever we want, we will do it without limits.

Let me put it this way---mutable classes tend to grow in size and lose cohesiveness.

If the commons-email authors had made this Email class immutable from the beginning, they wouldn't have been able to add so many methods to it or encapsulate so many properties. They wouldn't have been able to turn it into a monster. Why? Because an immutable object only accepts state through its constructor. Can you imagine a 33-argument constructor? Of course not.

When you make your class immutable in the first place, you are forced to keep it cohesive, small, solid and robust. Because you can't encapsulate too much and you can't modify what's encapsulated. Just two or three arguments of a constructor and you're done.

How Did I Design An Immutable Email?

When I was designing jcabi-email I started with a small and simple class: Postman. Well, it is an interface, since I never make interface-less classes. So, Postman is... a post man. He is delivering messages to other people. First, I created a default version of it (I omit the ctor, for the sake of brevity):

import javax.mail.Message;
@Immutable
class Postman.Default implements Postman {
  private final String host;
  private final int port;
  private final String user;
  private final String password;
  @Override
  public void send(Message msg) {
    // create SMTP session
    // create transport
    // transport.connect(this.host, this.port, etc.)
    // transport.send(msg)
    // transport.close();
  }
}

Good start, it works. What now? Well, the Message is difficult to construct. It is a complex class from JDK that requires some manipulations before it can become a nice HTML email. So I created an envelope, which will build this complex object for me (pay attention, both Postman and Envelope are immutable and annotated with @Immutable from jcabi-aspects):

@Immutable
interface Envelope {
  Message unwrap();
}

I also refactor the Postman to accept an envelope, not a message:

@Immutable
interface Postman {
  void send(Envelope env);
}

So far, so good. Now let's try to create a simple implementation of Envelope:

@Immutable
class MIME implements Envelope {
  @Override
  public Message unwrap() {
    return new MimeMessage(
      Session.getDefaultInstance(new Properties())
    );
  }
}

It works, but it does nothing useful yet. It only creates an absolutely empty MIME message and returns it. How about adding a subject to it and both To: and From: addresses (pay attention, MIME class is also immutable):

@Immutable
class Envelope.MIME implements Envelope {
  private final String subject;
  private final String from;
  private final Array<String> to;
  public MIME(String subj, String sender, Iterable<String> rcpts) {
    this.subject = subj;
    this.from = sender;
    this.to = new Array<String>(rcpts);
  }
  @Override
  public Message unwrap() {
    Message msg = new MimeMessage(
      Session.getDefaultInstance(new Properties())
    );
    msg.setSubject(this.subject);
    msg.setFrom(new InternetAddress(this.from));
    for (String email : this.to) {
      msg.addRecipient(
        Message.RecipientType.TO,
        new InternetAddress(email)
      );
    }
    return msg;
  }
}

Looks correct and it works. But it is still too primitive. How about CC: and BCC:? What about email text? How about PDF enclosures? What if I want to specify the encoding of the message? What about Reply-To?

Can I add all these parameters to the constructor? Remember, the class is immutable, and I can't introduce a setReplyTo() method; I would have to pass the replyTo argument through the constructor. That doesn't scale, because the constructor would have too many arguments, and nobody would be able to use it.

So, what do I do?

Well, I started to think: how can we break the concept of an "envelope" into smaller concepts? And this is what I invented. Like a real-life envelope, my MIME object will have stamps. Stamps will be responsible for configuring the Message object (again, Stamp is immutable, as are all of its implementers):

@Immutable
interface Stamp {
  void attach(Message message);
}

Now, I can simplify my MIME class to the following:

@Immutable
class Envelope.MIME implements Envelope {
  private final Array<Stamp> stamps;
  public MIME(Iterable<Stamp> stmps) {
    this.stamps = new Array<Stamp>(stmps);
  }
  @Override
  public Message unwrap() {
    Message msg = new MimeMessage(
      Session.getDefaultInstance(new Properties())
    );
    for (Stamp stamp : this.stamps) {
      stamp.attach(msg);
    }
    return msg;
  }
}

Now, I will create stamps for the subject, for To:, for From:, for CC:, for BCC:, etc. As many stamps as I like. The class MIME will stay the same---small, cohesive, readable, solid, etc.
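
To see why this decomposition scales, here is the stamp idea sketched against a minimal stand-in for Message (everything below is hypothetical and simplified; the real stamps in jcabi-email operate on javax.mail.Message):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for javax.mail.Message, for illustration only.
final class Msg {
  final List<String> headers = new ArrayList<>();
}

interface Stamp {
  void attach(Msg message);
}

// Each stamp configures exactly one aspect of the message.
final class StSubject implements Stamp {
  private final String subject;
  StSubject(String subj) {
    this.subject = subj;
  }
  @Override
  public void attach(Msg message) {
    message.headers.add("Subject: " + this.subject);
  }
}

final class StRecipient implements Stamp {
  private final String email;
  StRecipient(String addr) {
    this.email = addr;
  }
  @Override
  public void attach(Msg message) {
    message.headers.add("To: " + this.email);
  }
}
```

An envelope then just applies its stamps in a loop; supporting CC: or Reply-To: means writing a new stamp class, not growing the envelope.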

What is important here is when I made the decision to refactor: while the class was still relatively small. Indeed, I started to worry about these stamp classes when my MIME class was just 25 lines in size.

That is exactly the point of this article---immutability forces you to design small and cohesive objects.

Without immutability, I would have gone the same direction as commons-email. My MIME class would grow in size and sooner or later would become as big as Email from commons-email. The only thing that stopped me was the necessity to refactor it, because I wasn't able to pass all arguments through a constructor.

Without immutability, I wouldn't have had that motivator and I would have done what Apache developers did with commons-email---bloat the class and turn it into an unmaintainable monster.

That's jcabi-email. I hope this example was illustrative enough and that you will start writing cleaner code with immutable objects.

An Empty Line is a Code Smell

The subject may sound like a joke, but it is not. An empty line, used as a separator of instructions in an object method, is a code smell. Why? In short, because a method should not contain "parts." A method should always do one thing and its functional decomposition should be done by language constructs (for example, new methods), and never by empty lines.

Look at this Java class (it does smell, doesn't it?):

final class TextFile {
  private final File file;
  TextFile(File src) {
    this.file = src;
  }
  public int grep(Pattern regex) throws IOException {
    Collection<String> lines = new LinkedList<>();
    try (BufferedReader reader =
      new BufferedReader(new FileReader(this.file))) {
      while (true) {
        String line = reader.readLine();
        if (line == null) {
          break;
        }
        lines.add(line);
      }
    }

    int total = 0;
    for (String line : lines) {
      if (regex.matcher(line).matches()) {
        ++total;
      }
    }
    return total;
  }
}

This method first loads the content of the file. Second, it counts how many lines match the regular expression provided. So why does method grep smell? Because it does two things instead of one---it loads and it greps.

If we make a rule to avoid empty lines in method bodies, the method will have to be refactored in order to preserve the "separation of concerns" introduced by that empty line:

final class TextFile {
  private final File file;
  TextFile(File src) {
    this.file = src;
  }
  public int grep(Pattern regex) throws IOException {
    return this.count(this.lines(), regex);
  }
  private int count(Iterable<String> lines, Pattern regex) {
    int total = 0;
    for (String line : lines) {
      if (regex.matcher(line).matches()) {
        ++total;
      }
    }
    return total;
  }
  private Iterable<String> lines() throws IOException {
    Collection<String> lines = new LinkedList<>();
    try (BufferedReader reader =
      new BufferedReader(new FileReader(this.file))) {
      while (true) {
        String line = reader.readLine();
        if (line == null) {
          break;
        }
        lines.add(line);
      }
      return lines;
    }
  }
}

I believe it is obvious that this new class has methods that are much more cohesive and readable. Now every method is doing exactly one thing, and it's easy to understand which thing it is.

This idea about avoiding empty lines is also applicable to other languages, not just Java/C++/Ruby, etc. For example, this CSS code is definitely begging for refactoring:

.container {
  width: 80%;
  margin-left: auto;
  margin-right: auto;

  font-size: 2em;
  font-weight: bold;
}

The empty line here is telling us (screaming at us, actually) that this .container class is too complex and has to be decomposed into two classes:

.wide {
  width: 80%;
  margin-left: auto;
  margin-right: auto;
}
.important {
  font-size: 2em;
  font-weight: bold;
}

Unfortunately, using empty lines to separate blocks of code is a very common habit. Moreover, I very often see gaps of two or even three empty lines, all playing this evil role of a separator of concerns.

Needless to say, a properly designed class must have just a few public methods, and a properly designed method must have no more than ten instructions (according to Bob Martin). Empty lines inside methods encourage us to break this awesome rule and turn methods into multi-page poems.

Of course, it's easier to just click enter a few times and continue to code right in the same method, instead of thinking and refactoring first. This laziness will eventually lead to code that is hardly maintainable at all.

To prevent this from happening in your projects, stop using empty lines inside methods, completely. Ideally, prohibit them in your automated build. In qulice.com, a static analysis tool we're using in all Java projects, we created a custom Checkstyle check that prohibits empty lines in every method.

How Much Do You Cost?


I'm getting a few emails every day from programmers interested in working with Zerocracy remotely. The first question I usually ask is "What is your rate?" (we pay by the hour). What surprises me is how often people estimate themselves incorrectly, in both directions.

I hear very different numbers, from $5 to $500 per hour. I never say no, but usually come up with my own hourly rate estimate. This article explains what factors I do and don't take into account. These are my personal criteria; don't take them as an industry standard. I do find them objective and logical, though---so let me explain.

Open Source Contribution

badge

This is the first and the most important characteristic of a software developer. Do you contribute to open source projects? Do you have your own open source libraries that are used by some community? Do you write code that is publicly available and used?

If you have nothing to show here, I see three possible causes.

First, you're too shy to share your code because it's crap. Obviously, this is not a good sign. Not because your code could be bad, but because you're not brave enough to face this fact and improve. In our teams we pay a lot of attention to the quality of code and most of our new team members get surprised by just how high our quality bar is. You will also be surprised. The question is whether you will be able to adapt and improve or if you will give up and quit. If you didn't share your code before and have never dealt with negative feedback, you won't feel comfortable in our projects, where quality requirements are very high.

The second possible cause is that you work from nine to five, for food, without passion. Of course, nobody puts it that way. Instead, I often hear something like "my company doesn't pay me for open source contribution, and at home I want to spend time with my family." In modern software development, most of the code we're working with is open source---libraries, frameworks, tools, instruments, etc. Almost everything you're using in your commercial projects is open source. By paying your salary, your employer already invests in open source products, because you're an active user of them. The problem is that you are not interested in contributing more actively yourself. I see this as a lack of passion and self-motivation. Will you be an effective developer in our projects? Not at all, because our entire management model relies on self-motivation.

The last possible cause is that you don't know what to write and where to contribute, which means a lack of creativity. As I mentioned above, almost everything we're using now is open source, and these tools are full of bugs and not-yet-implemented features. And yet you don't see any areas for improvement? You don't know what could be done better? You're not able to find, report, and fix at least one bug in some open source product you're using every day? This means that you won't be able to find areas of improvement in our projects either, while we rely on your ability to discover problems creatively.

Thus, if your GitHub account is empty and your CV doesn't position you as "an active contributor to Linux kernel" (yeah, why not?), I immediately lose interest. On the other hand, when I see a 100+ stars project in your GitHub account, I get excited and ready to offer a higher rate.

Geographic Location

It is a common practice to pay higher rates to those who live in more expensive countries. When I get resumes from San Francisco programmers, their rates are $70+ per hour. The same skills and experience cost $15-20 in Karachi. The reason is the cost of living---it is much higher in the US than in Pakistan.

However, this reasoning doesn't sound logical to me. If you're driving a more expensive car, do we have to pay you a higher salary? The same goes for the place you stay. You've chosen the country you live in. You're using all the benefits of a well-developed country, and you're paying for them. It's your choice. You decided to spend more money on the quality of your life---what does that have to do with me?

Want to pay $30 for a lunch? Become a better engineer. Until then, buy a hot dog for a few bucks. Just saying that "I'm already here and my lunch costs $30" is not an argument.

Thus, the more expensive the place you live, the less money stays in your pocket. For us, this means that $100 will motivate a programmer in Karachi much more strongly than the same $100 will motivate the same person living in San Francisco. That's why we prefer to work with people whose expenses are lower. Our money will simply work better.

StackOverflow.com Reputation

We all know what StackOverflow is, but very few people (surprisingly few!) actively contribute to it. If your profile there is empty (or you don't have one), I realize that you 1) don't have any questions to ask and 2) have nothing to answer.

First, if you're not asking anything there, you are not growing. Your education process stopped some time ago, probably right after you got an office job. Or maybe you're too shy to ask? Or you can't phrase your questions in an accurate and precise format? Or maybe all your questions already have answers? In any case, it's sad.

Second, if you're not answering, you simply have nothing to say. In most cases, this means that you're not solving complex and unique problems. You're simply wiring together well-known components and collecting your paychecks.

Very often I hear people say that they solve most of their problems by asking the colleagues sitting next to them in the office. They say they simply don't need StackOverflow (or similar resources, if they exist) because their team is so great that any question can be answered internally. That's good for the team and bad for you. Why? You lack a very important skill---finding an answer on the public Internet. In our projects we discourage any horizontal communications between programmers, and you won't be able to get any help from anyone. You will be on your own, and you will fail, because you are used to being patronized by someone senior in your office.

StackOverflow is not just an indicator of how smart you are and how many upvotes you got for the "best programming joke." It is proof that you can find answers to your questions by communicating with people you don't know. It is a very important skill.

Years of Experience

badge

"I've written Java for 10 years!"---so what? This number means only one thing to me---you managed to survive in some office for ten years. Or maybe in a few offices. You managed to convince someone that he has to pay you for ten years of sitting in his building. Does it mean that you were writing something useful? Does it mean that your code was perfect? It doesn't mean any of that.

Years of experience is a false indicator. It actually may play against you, in combination with the other indicators mentioned above. If your CV says that you just started to program two years ago and your GitHub and StackOverflow accounts are empty---there is still a chance you will improve. You're just at the beginning of your career. However, if your CV says that you're a "10-year seasoned architect" with zero open source contributions---this means that you're either lying about those ten years or you're absolutely useless as an architect.

My point is that the "years of experience" argument should be used very carefully. Play this card only if you have other merits. Otherwise, keep it to yourself.

Certifications

badge

Oracle, Zend, Amazon, IBM, MySQL, etc.---I'm talking about these certifications. In order to get them you have to pass an exam. Not an easy one, and not online. It is a real exam taken in a certification center, where you sit in front of a computer for a few hours, without any books or Internet access, answering questions. Rather a humiliating activity for a respected software developer? Indeed. And there is a high probability of failure, which is also rather embarrassing.

It is a very good sign if you've managed to go through this. If you've done it a few times, even better. However, if you've earned no certifications in your entire career, it is for one of the following reasons:

First, you're afraid to lose. A serious certification may cost a few hundred dollars (I paid over $700 for SCEA) and you will not get a refund if you fail. If you're afraid to lose, you're afraid to fight. This means you'll chicken out in a real-life situation, where a complex problem will need to be solved.

Second, you don't invest in your profile. This most probably means that you don't want to change companies and prefer to find a peaceful office, where you can stay forever. I remember I once said to a friend of mine---"you will greatly improve your CV if you pass this certification." He answered with a smile---"I hope I won't need a CV any more, I like this company." This attitude is very beneficial for the company you're working for, but it definitely works against you.

In my experience, the best team players are those who work for themselves. Healthy individualism is a key success factor. If your primary objective is to earn for yourself (money, reputation, skills, or knowledge)---you will be very effective in our projects. Certifications in your profile are an indicator of that healthy individualism we're looking for.

Skills Variety

The more technologies or programming languages you know, the less you cost. I'm not saying that it's not possible to be an expert in many things at the same time---that's entirely possible. But let me give you a pragmatic reason why you shouldn't---competition. There are thousands of "Java7 programmers" on the market---we can easily choose whoever we need. But there are not so many "Hadoop programmers" or "XSLT designers."

If you focus on some specific area and become an expert there, your chances of finding a job are lower, but the payout will be bigger. We usually end up paying more to narrow-skilled specialists, mostly because we have no choice. If a project we're working on needs a Lucene expert, we'll find the right person and do our best to get him/her on board. Doing our best means increasing the price, in most cases.

Thus, when I hear that you're "experienced in MySQL, PostgreSQL, Oracle and SQLite" I realize that you know very little about databases.

Talks and Publications

badge

I think it is obvious that having a blog (about programming, not about your favorite cat) is a positive factor. Even better is being an occasional speaker at conferences or meetups. With a blog, I pay attention to the number of comments people leave on your articles. With a conference, the most important criterion is how difficult it was to get onto the list of speakers.

Both blog articles and conference presentations make you much more valuable as a specialist, mostly because they demonstrate that some people have already reviewed your work and your talent. And it was not just a single employer, but a group of other programmers and engineers. This means that we can also rely on your opinions.

Besides that, if you write and present regularly, you have a very important skill/talent---you can present your ideas in a "digestible" way. In our projects we discourage informal communications and use ticketing systems instead. In those tickets you will have to explain your ideas, questions, or concerns so that everybody can understand you. Without sufficient presentation skills, you won't survive in our projects.

BTW, some software developers even file patents in their names---why can't you do this? Or maybe even publish a book. Why not?

Previous Employment

I usually don't pay much attention to this section of your CV. Our management model is so different from anything you can see anywhere else that it doesn't really matter how many times you were fired before and how senior of a position you have/had with your full-time employer. Even if your title is "CTO of Twitter"---it doesn't mean anything to me.

My experience tells me that the bigger the company and the higher the position in it---the further away you stay from the source code and from real technical decisions. VPs and CTOs spend most of their time on management meetings and internal politics.

I'm much more interested in what you've done over the last few years than in where you've done it and what they called you while you were doing it.

Education

BSc, MSc, PhD... do we care? Not really. Education is very similar to the "previous employment" section mentioned above. It doesn't really matter where exactly you've spent those five years after school. What matters is what you have done during that time. If you have nothing to say about your activity at the university, then what will its name tell me?

Well, of course, if it is Stanford or MIT, this will make a difference. In this case I can see that you managed to meet their graduation standards and managed to find the money to study there. This is a good sign and will definitely increase your hourly rate. But if it is some mumbo-jumbo university from nowhere (like the one I graduated from), keep this information to yourself.

Rates

$100+ per hour we gladly pay to an expert who owns a few popular open source products; has a StackOverflow score above 20K; has certifications, articles, presentations, and maybe even patents.

$50+ per hour we pay to a professional programmer who has open source projects of his own or is an active contributor; has a StackOverflow score over 5K; writes about software development; possesses a few certifications.

$30+ per hour we pay to a programmer who regularly contributes to open source code; is present in StackOverflow; has some certifications.

$15 per hour we pay to everybody else.

Don't get me wrong and don't take these numbers personally. The rate you're getting is a measurable metric of your professional level, not of you as a person. Besides, the level is not static, it is changing every day, and it's entirely in your hands.

I wrote this article mostly in order to motivate you to grow.

All these criteria are applicable to new members of our teams. Once you start writing code, we measure your performance and you may get completely different numbers; see How Hourly Rate Is Calculated.

BTW, the illustrations you see above were created by Andreea Mironiuc.


If you like this article, you will definitely like these very relevant posts too:

Pimp Up Your Resume
Here are a few simple hints for making a software developer's resume sound bright, convincing, and right to the point.

How to Pay Programmers Less
Programmers are expensive and difficult to control; here are a few tricks to keep them underpaid and happy, for a while.

Why Don't You Contribute to Open Source?
An active open source contribution is a good habit for a software developer who is passionate about his or her job.


Are You a Hacker or a Designer?


Twenty years ago, the best programmer was the one capable of fitting an entire application into a 64KB .COM file. Those who were able to get the most out of that poor Intel 80386 were the icons of programming.

That's because twenty years ago computers were expensive and programmers were cheap. That was the time of the "hacker mentality." That time is over. That mentality is not appreciated any more, because the market situation is completely opposite.

Today, computers are cheap and programmers are expensive. This is the era of the "designer mentality," when the readability of our code is much more important than its performance.

Prices vs Salaries

badge

Look at this graph. It compares two trends over the last twenty years (1994-2014). The first trend falls, showing how much cheaper computer memory and HDD storage have become over that period.

The second trend demonstrates how much software developers' salaries rose over the same period. More accurately, they tripled. I didn't find an official report on this, but I'm sure it's no secret to anyone that programmers' salaries keep growing---$200,000 per year for a senior developer is not a dream any more... while twenty years ago $60K was the best offer around. I found this article about the subject very interesting.

Basically, this means that in order to create a PHP website in 1994 we had to spend 1,000 times more on hardware and a third as much on programmers as we do now, in 2014. And we're talking about the same stack of technologies here. The same Linux box with an Apache HTTP Server inside.

The difference is that in 1994, if our application had performance problems because of hardware limitations, we paid $35,000 for each additional gigabyte of RAM, while in 2014 we pay $10.

In 1994 it was much more efficient to hire more programmers and ask them to optimize the code, or even rewrite it, instead of buying new hardware. In 2014 the situation is exactly the opposite. It is now much cheaper to double the size of the server (especially if it is a virtual cloud server) than to pay salaries for optimizing the software.

In 1994 the best engineers had that "hacker mentality," while in 2014 the "designer mentality" is much more appreciated.

Hacker Mentality

Someone with a hacker mentality would call this Fibonacci Java method "elegant code" (would you?):

public int f(int n) { return n>2?f(n-1)+f(n-2):n; }

I would highlight these qualities of a good hacker:

  • uses all known (and unknown) features of a programming language
  • divides others into hackers and newbies, and codes for hackers
  • gets bored and frustrated by rules and standards
  • doesn't write unit tests---juniors will write them later
  • enjoys fire-fighting---that's how his talent manifests
  • prefers talks over docs, since they are much more fun
  • hates to see his code being modified by someone else
  • likes to dedicate himself to one project at a time

A hacker is a talented individual. He wants to express his talent in the software he writes. He enjoys coding and does it mostly for fun. I would say, he is married to his code and can't imagine its happy life after an eventual divorce. Code ownership is what a hacker is about---he understands himself as an "owner" of the code.

When I ask one of my hacker friends---"How will someone understand what this code does?" I almost always hear the same answer---"They will ask me!" (usually said very proudly, with a sincere smile).

Designer Mentality

Someone with a designer mentality would refactor the code above to make it easier to read. He would call this Java function "elegant code" (how about you?):

public int fibo(final int pos) {
  final int num;
  if (pos > 2) {
    num = fibo(pos - 1) + fibo(pos - 2);
  } else {
    num = pos;
  }
  return num;
}

I think these qualities can be attributed to a good designer:

  • tends to use traditional programming techniques
  • assumes everybody is a newbie and writes accordingly
  • enjoys setting rules and following them
  • prefers docs over talks and automation over docs
  • spends most of his coding time on unit tests
  • hates fire-fighting and working over time
  • loves to see his code being modified and refactored
  • works with a few projects at the same time

A designer is a talented team player. He contributes to the team processes, standards, rules, education, and discipline as much as he contributes to the source code. He always makes sure that once he leaves the project his code and his ideas stay and work.

The highest satisfaction for a good designer is to see his code living its own life---being modified, improved, refactored and eventually retired. A designer sees himself as a parent of the code---once it is old enough to walk and talk, it has to live its own life.

The Future

If you consider yourself a hacker, I believe it's time to change. The time of hackers is fading out.

In the near future we will probably even stop thinking in terms of "hardware" and will run our applications in elastic computational platforms with unlimited amounts of memory, CPU power and storage space. We will simply pay for resource utilization and almost any performance issue will just add a few extra dollars to our monthly bills. We won't care about optimization any more.

At the same time, good software engineers will become more and more expensive and will charge $500+ per hour just to check out software and give a diagnosis. Just like good lawyers or dentists.

That's why, while developing a new software product, those who pay for it will care mostly about its maintainability. Project sponsors will understand that the best solution they can get for their money is the one that is the most readable, maintainable, and automated.

Not the fastest.


Paired Brackets


Here is a notation rule I'm using in Java code: a bracket should either start/end a line or be paired on the same line.

The notation applies universally to any programming language (incl. Java, Ruby, Python, C++, PHP, etc.) where brackets are used for method/function calls.

Here is how your code will look, if you follow this "Paired Brackets" notation:

new Foo( // ends the line
  Math.max(10, 40), // opened and closed on the same line
  String.format(
    "hello, %s",
    new Name(
      Arrays.asList(
        "Jeff",
        "Lebowski"
      )
    )
  ) // starts the line
);

Obviously, the line with a closing bracket should start at the same indentation level as the line with its opening pair.

This is how your IDE will render the code if you follow this notation (IntelliJ IDEA):

The figure

Sublime Text will also appreciate it:

The figure

As you can see, those light vertical lines on the left side of the code help you navigate, if you follow the notation.

Those multiple closing brackets may look strange to you at the beginning---but give yourself some time and you will get used to them :)

Fluent

This is how I would recommend formatting fluent method calls (this is Java in NetBeans):

The figure
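Since the screenshot may not reproduce here, here is a minimal sketch of what such formatting can look like (hypothetical Java: each method in the chain starts its own line, and any multi-line argument list follows the paired-brackets rule):

```java
public final class FluentDemo {
  public static void main(final String[] args) {
    // Each call in the chain starts its own line; the arguments
    // of String.format() don't fit on one line, so its opening
    // bracket ends a line and its closing bracket starts one.
    final String greeting = new StringBuilder()
      .append(
        String.format(
          "Hello, %s %s",
          "Jeff",
          "Lebowski"
        )
      )
      .append('!')
      .toString();
    System.out.println(greeting); // prints "Hello, Jeff Lebowski!"
  }
}
```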

Arrays

Here is how you format an array in "Paired Brackets" notation (this is Ruby in RubyMine):

The figure
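If the Ruby screenshot is not visible, the same layout can be sketched in Java with a hypothetical array initializer:

```java
public final class ArrayDemo {
  public static void main(final String[] args) {
    // The opening curly bracket ends its line, and the closing
    // one starts a line at the same indentation level.
    final String[] names = {
      "Jeff",
      "Walter",
      "Donny"
    };
    System.out.println(names.length); // prints "3"
  }
}
```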

As you can see, the same principle applies to square and curly brackets.

JSON

The same principle is applicable to JSON formatting. This is a small JSON document in Coda 2:

The figure
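In case the screenshot is not visible, a small hypothetical JSON document formatted per the notation might look like this (every bracket either pairs on one line or opens/closes a line of its own):

```json
{
  "name": "Jeff Lebowski",
  "aliases": ["The Dude", "El Duderino"],
  "address": {
    "city": "Los Angeles",
    "state": "CA"
  }
}
```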

JavaScript

JavaScript should also follow the same principle. This is how your .js code would look in Atom:

The figure

Python

Finally, here is Python in PyCharm:

The figure


Incremental Billing


When you hire a software developer (an individual or a team), there are basically two types of contracts: fixed price or time-and-material. They are fundamentally different, but the truth is that in either case---you lose.

badge

In the eXtremely Distributed Software Development (XDSD) methodology everything is different, including the way we invoice our clients. Let's see what happens in traditional contracts and what changes in XDSD, which we practice in Zerocracy.

The difference between fixed price and T&M is in who takes the risk of spending money and getting nothing in return. This risk is huge in the software development industry, especially in outsourcing. Over 80% of all software projects fail to achieve their objectives, and about 30% of startups fail by running out of cash. However, very few programmers (if any) fail to get their monthly salaries on time.

What does this tell us?

I guess it means that in all failures you---the client---will be the loser.

Time and Material

In T&M you will simply pay and pray. If your programmers appear to be honest workaholics you may get lucky and get something done. As you can see from the numbers above, however, this is rarely the case. Don't fool yourself; there won't be any workaholics in your project. Even if you adopt micro-management and corporal punishment, your overall costs will be much higher than expected and the quality will suffer.

This is what a monthly T&M invoice will look like. You will pay for the time spent by programmers pretending to be working on your project. Well, as I said above, some of them will occasionally do something useful, but overall statistics tell us that most of that time will be wasted.

No matter how good or bad the code written during that month is---you still have to pay the bill. How many more invoices will you get until the product is done? Nobody knows.

In the end---you lose.

Fixed Price

In Fixed Price you will feel secure at the beginning---"the statement of work specifies everything and the price is fixed; how can I lose?" According to the statistics above, however, programmers are much smarter than their clients. You will lose in quality. Yes, you will get something for that fixed price, but it will be throw-away software. And when you decide to modify it, new costs will bubble up. In the end, the whole project will be ruined and your money will simply have been turned into programmers' salaries. This model is even more risky than T&M, where you at least have a chance.

Once in a while you will receive an invoice with a list of milestones reached. Every milestone will contain a certain set of features implemented in the product. Keep in mind that the primary motivation of your programmers will be to do less and charge more. Every time you ask for improvements or corrections, there will be a fight about budget. You will either give up and lose a lot of money or your team will significantly jeopardize quality, in order to stay profitable.

In either case---you lose.

Incremental Billing

So, what is the solution? Is it possible to have win-win contracts with programmers?

Yes, it is. We call it "Incremental Billing."

Remember, in XDSD we work with a stream of micro-tasks, usually completed in less than an hour. Each completed task produces a new increment (aka a "release" or "version") of software. An increment could be a bug fix, a bug report, a new feature or a micro-step towards any of these.

By the end of a week you get a bill that lists every single increment delivered during the week, the amount of time spent on its development and its total cost. Every increment costs you 30-60 minutes of a programmer's time (plus our fees).

Besides that, by the end of the week, you get an updated version of a project plan, with a re-estimated budget. Thus, you see what was done so far and how much needs to be done, according to our estimate.

How does this help you not lose/waste money? Here's how:

  • you fully control your budget
  • you pay only for the work completed
  • you track the progress with a granularity of minutes
  • you don't pay for meetings, chats, lunches or coffee breaks
  • programmers stay very motivated, since they are paid by result
  • there is no long-term commitment, and you can stop at any time
  • every increment passes all quality checks

As you can see, the XDSD methodology not only improves the way we develop software but also fixes the flaws in the way you pay for it. Since it is a win-win model, it is beneficial both for programmers and for you---the paying project sponsor.


How We Write a Product Vision


Every software project we work with starts with a Product Vision document. We create it during our Thinking phase. Even though the document is as short as two pages of English text, its development is the most painstaking task in the whole project.

There are a few tricks and recommendations which I'd like to share.

We usually design a Product Vision in four sections: product statement, stakeholders and needs, features, and quality requirements.

Product Statement

Product Statement is a one-paragraph declaration of intent, explaining to an absolute stranger what this product is about and what it is for. It is very similar to an elevator pitch. The Statement must answer these questions, preferably in this specific order:

  1. Who is the customer?
  2. What does she want?
  3. What is the market offering now?
  4. What is wrong with existing offers?
  5. How will our product fix this?

You should answer all these questions in less than 60 words altogether. If you need more words, something is wrong with your understanding of the product under development. If you can answer them in 20 words, your product will conquer the world.

By the way, don't confuse a Product Statement with a Mission, which is a much broader declaration of an overall goal of your business. You may have a hundred products but only a single mission. For example, Disney says that its mission is: "to make people happy." They've got hundreds of products that help them accomplish this mission. And each product has its own Product Statement.

I find these articles helpful: The Product Vision, Agile Artifacts: The Product Vision Statement, The Art of Agile Development: Vision.

Stakeholders and Needs

This section must list everybody whose life will be affected by the product (positively or negatively). Your list of stakeholders may include: sponsors, developers, users, competitors, government, banks, web hosting providers, Apple Store, hackers, etc.

It is very important to list both positive and negative stakeholders. If your product is going to automate some routine manual operations, don't forget that someone will be made redundant because of it. No matter how "good" your product is, there is always an "evil" side. The invention of the iPhone made millions of people happy, but also caused a lot of trouble for Nokia and Blackberry. An eventual invention of a cancer vaccine will make millions of people healthier, but will also make thousands of oncologists jobless. My point is that any project has both positive and negative stakeholders.

Each stakeholder must have a list of needs. They have to be simple and straightforward, like "earn money," "increase profit," "share photos," or "host a website."

I would recommend defining one or two needs for each stakeholder. If there are more than three, think again---do you really understand what your stakeholders need?

Your project will be considered successful if you satisfy all the needs of all your positive stakeholders and neutralize negative ones.

This Stakeholder Needs and Requirements article from SEBOK will be helpful.

Actors and Features

In this section we list actors (entities communicating with the product) and the key functionalities they use. This is the most abstract definition of functional requirements of the product. It doesn't need to be detailed. Instead, it has to be very high-level and abstract. For example, this is how our interaction with a well-known product may be described in two lines:

User can post tweets, read tweets of his friends,
  follow new friends and re-tweet their tweets.

Is it clear to a stranger what we're talking about here? Absolutely not---what is a "tweet," what does it mean to "follow," and what is a "re-tweet?" These questions have no answers in the Product Vision document, but it's clear that a user will have four main features available. All other features will be similar to those.

Twitter is a multi-billion dollar business with a multi-million dollar product. However, we managed to explain its key features in just two lines of text. You should do the same with your product. If you can't fit all its features into just two or three lines, reconsider your understanding of the product you're going to develop. Also, read about the "feature bloat dilemma."

Each actor must have at least three and at most six features. If there are more, you should group them somehow. If there are fewer, break them into smaller and more detailed features.

Quality Requirements

This section lists all important non-functional requirements. Any product may have hundreds of quality requirements, as well as hundreds of features. However, a Product Vision document must be focused on the most important ones. Consider some examples:

Any web page must open in less than 300ms.
Total cost of ownership must be less than $5000/mo.
Mobile app must be tailored for 10+ popular screen sizes.
Mean time to recover must be less than 2 hours.
DB must be scalable up to 5Tb without cost increases.

It is also very important to keep requirements measurable (like each of these examples). Every line in this section is a message to product developers. They will read this document in order to understand what is most important to the sponsor of the project. For example, these quality requirements are useless: "user interface must be attractive," "web site must be fast" or "the system must be stable." They are not measurable or testable. All they do is distract developers. If you can't make a strict and measurable statement about your quality objectives, don't write anything. It's better to say nothing than set false or ambiguous goals here.

Try to keep this section short. There should be six quality requirements, at most.

Remove Noise

Every section must be no more than twenty lines in length. Even if you're developing a Google killer with a $50 million budget, your Vision document must be as short as two pages.

For most of my clients this is a very complex and brain-damaging task. They usually come to us with a 50-page document explaining their business ideas with all the important details. From this document, we should extract only the information that really makes a difference.

The Product Vision document must keep its reader on the highest level of abstraction. The document must take less than a minute to read, from start to finish.

If you can't keep it short---you don't understand your product well enough.

Example

Here is an example of a very simple Product Vision for a Facebook killer:

Statement
  Facebook doesn't allow users to purchase "likes",
  our social network will have this.

Stakeholders and Needs
  Sponsor: to raise investments.
  Developer: to earn money by programming.
  Users: to share photos and purchase popularity.
  Bank: to make commission on every purchase.
  Government: to protect society against abusive content.
  Competitors: to wipe us off the market.

Actors and Features
  User can create account, upload photos, share photos,
    send personal messages, like other photos, purchase likes.
  Admin can ban user accounts, view summary reports, authorize
    credit card transactions, configure system parameters,
    monitor server resource usage.
  Bank can process credit card transactions.

Quality Requirements
  Any page must open in less than 300ms.
  Availability must be over 99.999%.
  Test coverage must be over 80%.
  Development pipeline must be fully automated.
  Interfaces must include web site and iOS/Android app.

Diplomacy

We follow all these recommendations in our projects at Zerocracy. You can use them in your projects as well, but keep in mind that the process of defining a Product Vision could be very painful. You may sometimes offend a client by over-simplifying their "great" business idea. "Really? I am ready to pay $250,000 for something awesome and you're telling me that you've only got ten lines for it? Huh?"

To work around this situation, split the client's documentation into two parts. The first part will fit into the Product Vision document; the second one will be called "supplementary documentation" and will contain all that valuable information you've got from the client. You may use that documentation later, during the course of product development.

But don't cut corners. Don't allow your client (or anyone else) to force you to bloat the Product Vision. The document has to be very short and explicit.

No lyrics, only statements.

PS. On top of all this we place a Glossary.

© Yegor Bugayenko 2014–2018

What Does a Software Architect Do?

Do you have a software architect in your project? Do you need one? Well, most agile teams do not define such a role explicitly and work in a democratic mode. Every important technical decision is discussed with the entire team, and the solution with the most votes wins. When such a team eventually decides to put a "software architect" badge on someone's t-shirt, the most reputable programmer gets it.

Jackie Brown (1997) by Quentin Tarantino

The badge rarely changes his responsibilities, though. After all, the team stays the same and enjoys having technical discussions together, involving everyone. In the end, a software architect is more of a status than a role with explicitly defined responsibilities. It is a sign of respect, paid by other team players to the oldest and the most authoritative one among them. Right?

Absolutely wrong!

Obviously, an architect is usually someone who has the most knowledge, skills, experience, and authority. Of course, an architect usually knows more than others and is able to communicate his knowledge with diplomacy and pedagogy when required. An architect is usually one of the smartest guys on the team.

This is not, however, what makes him/her an architect.

And this is not what the team needs. My definition of a software architect is this:

An architect is the one who takes the blame for the quality.

You can replace "blame" with accountability or responsibility. However, I prefer "blame," because it better emphasizes the fact that every quality issue in the product under development is a personal fault of the architect. Of course, together with the blame he also takes all the credit from happy customers, when the quality is good.

This is what the team needs---someone personally responsible for the quality of the software being developed.

How this guy will delegate this responsibility to others is his job. Whether he will use his knowledge and skills, or quality control tools, or unit testing frameworks, or authority, or coaching, or corporal punishment---it's his business. A project manager delegates quality control to the software architect, and it is up to the software architect how to delegate it further.

The role of a software architect is crucial for every project, even if there are just two coders working at the same desk. One of them has to be the architect.

An ideal architect has all the merits mentioned above. He listens to everybody and takes their opinions into account. He is a good coach and a teacher, with a lot of patience. He is an effective communicator and negotiator. He is a diplomat. And he is an expert in the technical domain.

But, even if he doesn't have all these merits, his decision is always final.

And this is the job of the project manager, to make sure that every technical decision the architect makes is not doubted by anyone. This is what delegation is all about---responsibility should always come with power.

As a project manager, you should regularly evaluate the results of your architect. Remember, the quality of the product your team is working on is his personal (!) responsibility. Any problems you see are his problems. Don't be afraid to blame him and punish him. But, always remember that in order to make your punishments productive you should give your architect full power in his actions. Let me reiterate: his decisions should be final.

If you, as a project manager, are not happy with the quality of the product and the architect doesn't improve the situation, replace him. Downgrade him to a programmer and promote one of the programmers to an architect. But always remember that there can only be one architect in the team, and that his decisions are final.

That's the only way of having a chance of building a perfect product.

Continuous Integration is Dead

A few days ago, my article Why Continuous Integration Doesn't Work was published at DevOps.com. Almost the same day I received a few strongly negative critiques on Twitter.

Here is my response to the unasked question:

Why the hell shouldn't continuous integration work, being such a brilliant and popular idea?

Even though I have some experience in this area, I won't use it as an argument. I'll try to rely only on logic instead.

BTW, my experience includes five years of using Apache Continuum, Hudson, CruiseControl, and Jenkins in over 50 open source and commercial projects. Besides that, a few years ago I created a hosted continuous integration service called fazend.com, renamed to rultor.com in 2013. Currently, I'm also an active user of Travis and AppVeyor.

How Continuous Integration Should Work

The idea is simple and obvious. Every time you make a new commit to the master branch (or /trunk in Subversion), a continuous integration server (or service) attempts to build the entire product. "Build" means compile, unit test, integration test, quality analysis, etc.

The result is either "success" or "failure." If it is a success, we say that "the build is clean." If it is a failure, we say that "the build is broken." The build usually gets broken because someone breaks it by committing new code that turns previously passing unit tests into failing ones.

This is the technical side of the problem. It always works. Well, it may have its problems, like hard-coded dependencies, lack of isolation between environments or parallel build collisions, but this article is not about those. If the application is well written and its unit tests are stable, continuous integration is easy. Technically.

Let's see the organizational side.

Continuous integration is not only a server that builds, but a management/organizational process that should "work." Being a process that works means exactly what Jez Humble said in Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, on page 55:

Crucially, if the build fails, the development team stops whatever they are doing and fixes the problem immediately

This is what doesn't work and can't work.

Who Needs This?

As we see, continuous integration is about setting the entire development team on pause and fixing the broken build. Let me reiterate. Once the build is broken, everybody should focus on fixing it and making a commit that returns the build to the stable state.

Now, my question is---who, in an actively working team, may need this?

A product owner, who is interested in launching new features to the market as soon as possible? Or maybe a project manager, who is responsible for the deadlines? Or maybe programmers, who hate to fix bugs made by someone else, especially under pressure?

Who likes this continuous integration and who needs it?

Nobody.

What Happens In Reality?

I can tell you. I've seen it multiple times. The scenario is always the same. We just start to ignore the continuous integration build status. Whether the build is clean or broken, we continue to do what we were doing before.

We don't stop and fix it, as Jez Humble recommends.

Instead, we ignore the information that's coming from the continuous integration server.

Eventually, maybe tomorrow or on Monday, we'll find some spare time and try to fix the build---only because we don't like that red button on the dashboard and want to turn it into a green one.

What About Discipline?

Yes, there is another side of this coin. We can try to enforce discipline in the team. We can make it a strict rule that our build is always clean, and whoever breaks it gets some sort of punishment.

Try doing this and you will get fear-driven development. Programmers will be afraid of committing anything to the repository, because they will know that if they cause a build failure, they will have to apologize, at least.

Strict discipline (which I'm a big fan of) only makes the situation worse in this case. The entire development process slows down, and programmers keep their code to themselves for as long as possible, to avoid possibly broken builds. When it's time to commit, their changes are so massive that merging becomes very difficult and sometimes impossible.

As a result you get a lot of throw-away code, written by someone but never committed to master, because of that fear factor.

OK, What Is The Solution?

I wrote about it before; it is called read-only master branch.

It is simple---prohibit anyone from merging anything into master and create a script that anyone can call. The script will merge, test, and commit. The script will not make any exceptions. If a branch fails even one unit test, the entire branch is rejected.

In other words: raise the red flag before the code gets into master.

This solves all problems.

First, the build is always clean. We simply can't break it because nobody can commit unless his code keeps the build clean.

Second, there is no fear of breaking anything. Simply because you technically can't do it. All you can do is get a negative response from a merging script. Then you fix your errors and tell the script to try again. Nobody sees these attempts, and you don't need to apologize. Fear factor is gone.

BTW, try to use rultor.com to enforce this read-only master branch principle in your project.

Stop Chatting, Start Coding

The first principle of eXtremely Distributed Software Development (XDSD) states that "everyone gets paid for verified deliverables." This literally means that, in order to get paid, every programmer has to write the code, commit it to the repository, pass a code review and make sure the code is merged into the destination branch. Only then, is his result appreciated and paid for.

Barton Fink (1991) by Joel Coen

For most of my clients this already sounds extreme. They are used to a traditional scheme of paying per hour or per month. They immediately realize the benefits of XDSD, though, because for them this approach means that project funds are not wasted on activities that don't produce results.

But that's not all.

This principle also means that nobody is paid for anything except tasks explicitly assigned to him/her. Thus, when a programmer has a question about current design, specification, configuration, etc.---nobody will be interested in answering it. Why not? Because there is no payment attached to this. Answering questions in Skype, Slack, or HipChat, or by email is something that is not appreciated in XDSD in any way. The project simply doesn't pay for this activity. That's why none of our programmers do this.

More about this philosophy here: It's Not a School!

We don't use any (I mean it!) informal communication channels in XDSD projects. We don't do meetings or conference calls. We never discuss any technical issues on Skype or by phone.

So, how do we resolve problems and share information?

We use task tracking systems for that. When a developer has a question, he submits it as a new "ticket." The project manager then picks it up and assigns it to another developer, who is able to answer it. Then, the answer goes back through the tracking system or directly into the source code.

The "question ticket" gets closed when its author is satisfied with the answer. When the ticket is closed, those who answered it get paid.

Using this model, we significantly improve project communications, by making them clean and transparent. We also save a lot of project funds, since every hour spent by a team member is traceable to the line of code he produced.

You can see how this happens in action, for example, in this ticket (the project is open source; that's why all communications are open): jcabi/jcabi-github#731. One Java developer is having a problem with his Git repository. Apparently he did something wrong and couldn't solve the problem by himself. He asked for help by submitting a new bug to the project. He was paid for the bug report. Then, another team member was assigned to help him. He did, through a number of suggestions and instructions. In the end, the problem was solved, and he was also paid for the solution. In total, the project spent 45 minutes, and the problem was solved.

Project Lifecycle in Zerocracy

In addition to being a hands-on programmer, I'm also co-founder and CTO of Zerocracy, a custom software development company. I play the role of a technical and management leader in all projects we work with.

I wrote this article for those who're interested in hiring me and/or my team. This article will demonstrate what happens from day one until the end of the project, when you choose to work with us.

You will see below that our methods of software development seriously differ from what many other teams are using. I personally pay a lot of attention to quality of code and quality of the internal processes that connect our team.

There are four phases in every project I work with in Zerocracy:

  • Thinking. Here we're trying to understand: What is the problem that the product is going to solve? We're also investigating the product's boundaries---who will work with the software (actors) and how will they work with it (user stories). Deliverables: specification. Duration: from 2 days up to 3 weeks. Participants: product owner, analyst(s), architect, project manager.

  • Building. Here the software architect is creating a proof-of-concept (aka an MVP or prototype or a skeleton). It is a one-man job that is done almost without any interaction with anyone else. The architect builds the product according to the specification in a very limited time frame. The result will have multiple bugs and open ends, but it will implement the main user story. The architect also configures continuous integration and delivery pipelines. Deliverables: working software. Duration: 2-5 days. Participants: architect.

  • Fixing. At this phase we are adding all the meat to the skeleton. This phase takes most of the time and budget and involves many participants. In some projects we invite up to 50 people to work at the same time. Since we treat all inconsistencies as bugs, this phase is mostly about finding, reporting and fixing bugs, in order to stabilize the product and get it ready for market launch. We increment and release the software multiple times a day, preferably to its user champions. Deliverables: bug fixes via pull requests. Duration: from weeks to months. Participants: programmer(s), designer(s), tester(s), code reviewer(s), architect, project manager.

  • Using. At this final phase we launch the product to its end-users and collect their feedback (both positive and negative). Everything they report back to us is registered as a bug. Then, we categorize the bugs and fix them. This phase may take years, but it never involves active implementation of new features. Deliverables: bug fixes via pull requests. Duration: months. Participants: programmer(s), code reviewer(s), project manager.

The biggest (i.e., longest and most expensive) phase is, of course, Fixing. It usually takes the majority of time (over 70%). However, the most important and risky phase is the first one---Thinking. A mistake made during Thinking will cost much more than a mistake made later.

Thinking

Thinking is the first and the most important phase.

First, we give a name to the project and create a GitHub repository. We try to keep all our projects (both open source and commercial) in GitHub. Mostly because the platform is very popular, very powerful, and really cheap ($7/mo for a set of 5 private projects). We also keep all communication in the GitHub issue tracker.

Then, we create a simple half-page SRS document (Software Requirements Specification). Usually this is done right inside the source code, but sometimes in the GitHub README.md file. What's important is that the document should be under version control. We will modify it during the course of the project, very intensively. The README.md should briefly identify main "actors" of the system and define the product scope.

Even though it is only half a page, the creation of this initial SRS document is the most important and the most expensive task in the entire project. We pay a lot of attention to this step. Usually this document is written by one of our system analysts in direct communication with the project sponsor. We can't afford a mistake at this step.

Then, we invite a few new system analysts to the project. These guys are responsible for turning our initial README into a more complete and detailed specification. They start by asking questions, submitting them one by one as GitHub issues. Every question is addressed to the product owner. Using his/her answers, system analysts modify the README document. Sometimes we're using Requs.

At the end of the Thinking phase we estimate the size of the project, in Hits of Code. Using this HoC metric, we can roughly estimate a budget.

Building

This is a one-man job for an architect. Every project we work on has an architect who is personally responsible for the quality and technical decisions. We have a few brilliant engineers for this role.

The Building phase is rather straightforward. The architect has to implement the solution according to the README, in a few working days. No matter how big the idea and how massive the planned development, the architect still has to create (build from scratch!) the product in, say, three days.

Besides building the software itself, the architect has to configure all basic DevOps processes, including: 1) automated testing and quality control, 2) deploying and releasing pipelines, 3) repository of artifacts, 4) continuous integration service, etc.

The result of this phase is a working software package, deployable to its destination and available for testers. Technical quality requirements are also defined at this phase.

More about the Building phase here: Nine Steps to Start a Software Project

Fixing

Now it's time to build a distributed team of programmers. First, we invite those who've worked on other projects and have already proven their quality. Very often we invite new people, finding them through StackOverflow, GitHub, Upwork, and other sources. The average team size is 15-25 programmers.

At this phase, we understand any inconsistency as a bug. If something is not clear in the documentation, or if something can be refactored for better readability, or if a function can be improved for higher performance---it is a bug to us. And bugs are welcome in our projects. We encourage everybody to report as many bugs as possible. This is how we achieve high quality.

That is why the phase is called Fixing, after all. We are reporting bugs and fixing them. Hundreds of bugs. Sometimes thousands. The product grows in front of our very eyes, because after every bug fix we re-deploy the entire product to the production platform.

Every bug is reported, classified, discussed, and fixed in its own GitHub ticket and its own Git branch. We never allow anyone to just commit to the master branch---all changes must pass through our quality controls and be merged into master by rultor.com, our merging bot.

Also important to mention is that all communications with the product owner and between programmers happen only through GitHub issues. We never use any chats, Skype, emails or conferencing software. We communicate only through tickets and comments in GitHub.

Using

This is the final phase and it can take quite a long time. By now, the product is ready and is launched to the market. But we still receive bug reports and feature requests from the product owner, and we still fix them through the same process flow as in the Fixing phase.

We try to keep this phase as quiet as possible, in terms of the number of bugs reported and fixed. Thanks to our intensive and proactive bug finding and fixing in the previous phase, we usually have very few problems in the Using phase.

And big feature requests? At this phase, we usually try to convert them into new projects and develop them separately, starting again from Thinking.

BTW, the illustrations you see above are made by Bárbara Lopes.

Best Hosted Continuous Integration Services for a Private Repository

Every project I'm working with starts with a setup of a continuous integration pipeline. I'm a big fan of cloud services, which is why I have always used Travis. A few of my clients questioned this choice recently, mostly because of the price. So I decided to make a brief analysis of the market.

I configured Rultor, an open source project, in every CI service I managed to find. All of them are free for open source projects. All of them are hosted and do not require any server installation. Here they are, in order of my personal preference (the first four are the best and highly recommended):

Each service is compared on its minimum price and on support for Linux1, Windows2, MacOS3, pull requests4, log compression5, and Docker6 builds. The minimum prices:

  Shippable     free (!)
  Travis        $69/mo
  AppVeyor      $39/mo
  Wercker       $350/mo
  SemaphoreApp  $29/mo
  Snap-CI       $30/mo
  Codeship      $49/mo
  CircleCI      $19/mo
  SolanoLabs    $15/mo
  DeployBot     $15/mo
  Vexor         ¢90/hr
  GreenHouseCI  $49/mo

If you know any other good continuous integration services, email me, I'll review and add them to this list. BTW, here is a "full" list of continuous integration software and services.

By the way, a few platforms from this list contacted me and asked me to review them again. Some even offered me money to put them higher in the list (kidding). Anyway, it's up to you to decide whether it's a good sign or not (I think it's good, they care about their PR). I marked them in the list with ☺ emoji.

Best Four

Shippable was easy to configure, since it understands .travis.yml out of the box. The user interface is easy to navigate, since it doesn't have a "settings" page at all (or I didn't find it). Everything is configured via a shippable.yml file in the repository. The service looks stable and robust; no complaints so far. What is especially cool about them is that they allow you to build in a Docker container.

Travis is the best platform I've seen so far. Mostly because it is the most popular. It integrates perfectly with GitHub and has proper documentation. One important downside is the price of $129 per month. "With this money you can get a dedicated EC2 instance and install Jenkins there"---some of my clients say. I strongly disagree, since Jenkins will require 24x7 administration, which costs way more than $129, but it's always difficult to explain.

Wercker is a European product from Amsterdam, which is still in beta and therefore free for all projects. The platform looks very promising. It is still free for private repositories and is backed by investments. They also have an interesting concept of build "boxes," which can be pre-configured, similar to Docker containers. It has worked rather stably over the last few months; no complaints so far.

AppVeyor is the only one that runs Windows builds. Even though I'm working mostly with Java and Ruby, which are expected to be platform independent, they very often appear to be exactly the opposite. When your build succeeds on Linux, there is almost no guarantee it will pass on Windows or Mac. I'm planning to use AppVeyor in every project, in combination with some other CI service. Here is how I integrate Maven builds with AppVeyor.

Others

SemaphoreApp is easy to configure and work with. It makes the impression of a light-weight system, which I generally appreciate. As a downside, they have an old version of Maven pre-installed, but this was solved easily with a short custom script that downloads and unpacks the latest Maven. Another downside is that they are not configurable through a file (like .travis.yml)---you have to do everything through the UI. They also support caching between builds.

Snap-CI is a product of ThoughtWorks, the author of Go, an open source continuous integration server. It looks a bit more complicated than the others, giving you the ability to define "stages" and combine them into pipelines. I'm not sure yet how these mechanisms may help in the small and medium-sized projects we're mostly working with, but they look "cool." There is also a very unfortunate limitation of 2GB of RAM per build---some of my Java projects fail because of that. Besides that, they don't give full access to the build server; for example, we can't modify anything in /etc---a show-stopper for us.

Codeship works fine, but their web UI looks a bit outdated. Besides that, they promise to work with pull requests, but I didn't manage to configure this. They simply don't post build statuses to our pull requests in GitHub, even though they build them. Maybe I'll find a way; so far it's not clear.

CircleCI: I still don't know why my build fails there. It's really difficult to configure and to understand what's going on. I'm still trying to figure it out...

SolanoLabs looks rather immature and difficult to configure. They don't even support automatic GitHub hook configuration when a new repository is added. However, their sales team spams me rather aggressively :)

Hosted-CI is for iOS/OSX only. They don't give anything away for free, even for open source projects. I haven't had a chance to test them yet.

CloudBees is basically a hosted Jenkins. I don't really like Jenkins, which is why I can't recommend this platform.

DeployBot doesn't even allow me to log in via GitHub, huh? They seem to be more "deployment" oriented, not just continuous integration.

Vexor looks nice and offers a rather unique billing model---they charge per build, not per month. I would definitely recommend giving it a try, although I couldn't make it work myself...

GreenHouseCI is a CI platform for mobile apps (iOS, Android, etc.). It seems interesting; I just don't have a full-scale mobile app to test it against.


gitlab-ci: will review soon

coverity.com: will review it soon and add to the list.

buddy.works: will review soon.

AWS Code Build and AWS Code Deploy: will review soon.

Ship.io is dead (as of 20-Sep-2016).

ZeroCI.com is dead (as of 28-Aug-2016).

Drone.io is not hosted any more, but is open source (as of 23-Jan-2017).

Hosted-ci doesn't look alive (as of 16-Apr-2017).

Magnum-CI.com doesn't look alive (as of 30-May-2017).


BTW, if you don't like the idea of keeping continuous integration in the cloud, consider these on-premise software packages (in order of preference): Jenkins, TeamCity, Go, Strider, BuildBot.

Keep in mind that no matter how good and expensive your continuous integration service is, it won't help you unless you make your master branch read-only.


1 This means that the platform can build your repo in Linux environment. Almost all of them do that by default, unless you configure them otherwise.
2 Some of them can build on Windows platform.
3 MacOS support means that an Objective-C/Swift product can be built there.
4 I mean GitHub pull request support here. Some of them can be integrated with GitHub and will build pull requests before they are merged. Build status will be visible in GitHub. A pretty convenient feature.
5 Log compression is a critical feature, at least for me. Most of my logs come from Maven, and without col -b they are too long and unreadable.
6 You would expect Docker containers to be supported by all of them, but unfortunately that's not the case. Ideally, all builds should run in containers.

Dependency Injection Containers are Code Polluters

While dependency injection (aka, "DI") is a natural technique of composing objects in OOP (known long before the term was introduced by Martin Fowler), Spring IoC, Google Guice, Java EE6 CDI, Dagger and other DI frameworks turn it into an anti-pattern.

I'm not going to discuss the obvious arguments against "setter injections" (like in Spring IoC) and "field injections" (like in PicoContainer). These mechanisms simply violate basic principles of object-oriented programming and encourage us to create incomplete, mutable objects that get stuffed with data during the course of application execution. Remember: ideal objects must be immutable and may not contain setters.

Instead, let's talk about "constructor injection" (like in Google Guice) and its use with dependency injection containers. I'll try to show why I consider these containers a redundancy, at least.

What is Dependency Injection?

This is what dependency injection is (not really different from plain old object composition):

public class Budget {
  private final DB db;
  public Budget(DB data) {
    this.db = data;
  }
  public long total() {
    return this.db.cell(
      "SELECT SUM(cost) FROM ledger"
    );
  }
}

The object data is called a "dependency."

A Budget doesn't know what kind of database it is working with. All it needs from the database is the ability to fetch a cell, using an arbitrary SQL query, via the cell() method. We can instantiate a Budget with a PostgreSQL implementation of the DB interface, for example:

public class App {
  public static void main(String... args) {
    Budget budget = new Budget(
      new Postgres("jdbc:postgresql:5740/main")
    );
    System.out.println("Total is: " + budget.total());
  }
}

In other words, we're "injecting" a dependency into a new object budget.
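The DB interface itself never appears in the article; a minimal sketch of what it might look like (the cell() signature is my assumption), together with a trivial in-memory implementation that lets the whole thing run, could be:

```java
// A hypothetical sketch of the DB interface the Budget depends on;
// the cell() signature here is an assumption, not the article's code.
interface DB {
  long cell(String query);
}

// A trivial in-memory implementation, enough to run the Budget:
final class FixedDb implements DB {
  private final long value;
  FixedDb(final long value) {
    this.value = value;
  }
  @Override
  public long cell(final String query) {
    // Ignores the query and always returns the same number.
    return this.value;
  }
}

// The Budget class, copied from the snippet above:
final class Budget {
  private final DB db;
  Budget(final DB data) {
    this.db = data;
  }
  long total() {
    return this.db.cell("SELECT SUM(cost) FROM ledger");
  }
}

public class Main {
  public static void main(final String... args) {
    // Injecting a fake dependency instead of Postgres:
    final Budget budget = new Budget(new FixedDb(100L));
    System.out.println("Total is: " + budget.total());
  }
}
```

Swapping FixedDb for Postgres requires no change to Budget at all, which is exactly the point of constructor injection.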

An alternative to this "dependency injection" approach would be to let Budget decide what database it wants to work with:

public class Budget {
  private final DB db =
    new Postgres("jdbc:postgresql:5740/main");
  // class methods
}

This is very dirty and leads to 1) code duplication, 2) inability to reuse, and 3) inability to test, etc. No need to discuss why. It's obvious.

Thus, dependency injection via a constructor is an amazing technique. Well, not even a technique, really. More like a feature of Java and all other object-oriented languages. It's expected that almost any object will want to encapsulate some knowledge (aka, a "state"). That's what constructors are for.

What is a DI Container?

So far so good, but here comes the dark side---a dependency injection container. Here is how it works (let's use Google Guice as an example):

import javax.inject.Inject;
public class Budget {
  private final DB db;
  @Inject
  public Budget(DB data) {
    this.db = data;
  }
  // same methods as above
}

Pay attention: the constructor is annotated with @Inject.

Then, we're supposed to configure a container somewhere, when the application starts:

Injector injector = Guice.createInjector(
  new AbstractModule() {
    @Override
    public void configure() {
      this.bind(DB.class).toInstance(
        new Postgres("jdbc:postgresql:5740/main")
      );
    }
  }
);

Some frameworks even allow us to configure the injector in an XML file.
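In Spring IoC, for instance, roughly the same binding could be expressed in XML like this (class names follow the snippets above; the package name is my assumption):

```xml
<beans>
  <bean id="db" class="com.example.Postgres">
    <constructor-arg value="jdbc:postgresql:5740/main"/>
  </bean>
  <bean id="budget" class="com.example.Budget">
    <constructor-arg ref="db"/>
  </bean>
</beans>
```

The wiring logic moves out of Java and into a file the compiler never sees.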

From now on, we are not allowed to instantiate Budget through the new operator, like we did before. Instead, we should use the injector we just created:

public class App {
  public static void main(String... args) {
    Injector injector = Guice.createInjector(/* the module from the previous snippet */);
    Budget budget = injector.getInstance(Budget.class);
    System.out.println("Total is: " + budget.total());
  }
}

The injector automatically figures out that, in order to instantiate a Budget, it has to provide an argument for its constructor. It will use the instance of class Postgres, which we instantiated in the injector.

This is the right and recommended way to use Guice. There are a few even darker patterns, though, which are possible but not recommended. For example, you can make your injector a singleton and use it right inside the Budget class. These mechanisms are considered wrong even by DI container makers, however, so let's ignore them and focus on the recommended scenario.

What Is This For?

Let me reiterate and summarize the scenarios of incorrect usage of dependency injection containers:

  • Field injection

  • Setter injection

  • Passing injector as a dependency

  • Making injector a global singleton

If we put all of them aside, all we have left is the constructor injection explained above. And how does that help us? Why do we need it? Why can't we use plain old new in the main class of the application?

The container we created simply adds more lines to the code base, or even more files, if we use XML. And it adds nothing except additional complexity. We now have to keep that extra layer in mind every time we ask a simple question: "What database is used as an argument of a Budget?"

The Right Way

Now, let me show you a real-life example of using new to construct an application. This is how we create a "thinking engine" in rultor.com (the full class is in Agents.java).
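The actual snippet from Agents.java is too long to reproduce here; a minimal sketch of the same style of composition, with agent classes invented purely for illustration, would look like this:

```java
// All class names here are invented for illustration; the real
// Agents.java composes dozens of such agents with plain new operators.
interface Agent {
  String act(String ticket);
}

final class Merges implements Agent {
  @Override
  public String act(final String ticket) {
    return "merged " + ticket;
  }
}

final class Understands implements Agent {
  private final Agent origin;
  Understands(final Agent agent) {
    this.origin = agent;
  }
  @Override
  public String act(final String ticket) {
    // Decorates another agent, adding its own behavior:
    return "understood and " + this.origin.act(ticket);
  }
}

public class App {
  public static void main(final String... args) {
    // The entire "thinking engine" is composed once, at startup:
    final Agent agent = new Understands(new Merges());
    System.out.println(agent.act("#42"));
  }
}
```

The real composition is just this, nested many levels deep, with no container anywhere in sight.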

Impressive? This is a true object composition. I believe this is how a proper object-oriented application should be instantiated.

And DI containers? In my opinion, they just add unnecessary noise.

PS. Another live example is here:


Why Monetary Awards Don't Work?


Monetary rewards for employees. Do they work? Should we use them? Can money motivate creative minds? Will a programmer work better if he gets paid only when he reaches his goals and objectives?

Much research has already been done on this subject, and most of it proves that connecting results with money is a very demotivating approach. For example, Ian Larkin says that the most productive workers "suffered a 6-8% decrease in productivity after the award was instituted."

I believe this is completely true. Money may become a terrible de-motivator for all modern employees (not just programmers).

My question is---why is this so?

Why doesn't money work, even when it was invented to be a universal instrument to measure our labor? Why can't an American dollar, which has been used for centuries as a trading tool between working people, be used anymore?

Why, in a modern office, do we try to hide monetary motivation and replace it with everything else, like free lunches, team building events, paid vacations, etc.? Why don't we just say---"Jeff completed his task faster than everybody else. This is his $500 check. Whoever completes the next task gets $300," out loud in the office?... Sounds uncomfortable, doesn't it?

Why does money as a motivator scare us?

I have an answer.

Money doesn't work when there are no ground rules.

When we say that Jeff will get a $500 bonus if he finishes his task on time, but don't say what he should do when someone distracts him---Jeff gets frustrated. He also doesn't understand who his boss is anymore. Does he just work for the bonus, or should he also satisfy a CTO who comes to his desk asking him to do something else urgently? Is Jeff allowed to tell the CTO "to get lost" because he's working towards his own personal objective (the bonus)?

In all cases I've seen myself and in all research cases I've read about, people keep repeating the same mistake. They create a rewards program (monetary or not) without setting ground rules for the team. By doing so, they encourage people to play Wild West, where the fastest gun gets the bag of cash. Obviously, the Bad and the Ugly get to the prize faster, while the Good gets demotivated and depressed.


In a clockwise direction from the top left corner: The Good, the Bad and the Ugly (1966) by Sergio Leone; Roger Federer; A Serious Man (2009) by Ethan Coen and Joel Coen; Two and a Half Men (TV Series).

What do I mean by ground rules?

It should be a simple document (PMBOK calls it a Staffing Management Plan) that helps me, as a team member, answer at least these basic questions:

  • How are my personal results measured?
  • Who gives me tasks and who do I report to?
  • How should I resolve conflicts between tasks?
  • What are my personal deadlines for every task?
  • Do I have measurable quality expectations for my deliverables?
  • How do my mistakes affect my performance grade?

The ground rules document should be superior to your boss. If the document says that your results get an A+ grade, the boss should have no say. If she doesn't like you personally, it doesn't matter. You get an A+ grade, and you are the best. That's it.

Does your team have such a document? Can you answer all of these questions? If not, you're not ready for a rewards program. It will only make your management situation worse, just like all the scientific research says. Rewards will motivate the most cunning to take advantage of the most hard working and good-natured. Team spirit will suffer, big time.

On the other hand, if you have that "ground rules" document and you strictly follow it, giving monetary rewards to your workers will significantly increase their performance and motivation. They will know exactly what needs to be done to get the rewards, and they won't have any distraction. Your team won't be a group of wild west gunslingers anymore, but more like players in a sports arena. The best players will go further, and the worst will know exactly what needs to be done to improve. Fair competition will lead to a better cumulative result.

Moreover, if your ground rules are strict and explicit, you can use not only rewards, but also punishments. And your team will gladly accept them, because they will help emphasize what (and who) works best and help get rid of the waste.

I'm speaking from experience here. In XDSD we're not only rewarding programmers with money, but we also never pay for anything except delivered results. We manage to do this mostly because our ground rules are very strict and non-ambiguous. And we never break them.


Built-in Fake Objects


While mock objects are perfect instruments for unit testing, mocking through mock frameworks may turn your unit tests into an unmaintainable mess. Thanks to them we often hear that "mocking is bad" and "mocking is evil."

The root cause of this complexity is that our objects are too big. They have many methods and these methods return other objects, which also have methods. When we pass a mock version of such an object as a parameter, we should make sure that all of its methods return valid objects.

This leads to inevitable complexity, which turns unit tests into waste that is almost impossible to maintain.

Object Hierarchy

Take the Region interface from jcabi-dynamo as an example (this snippet and all others in this article are simplified, for the sake of brevity):

public interface Region {
  Table table(String name);
}

Its table() method returns an instance of the Table interface, which has its own methods:

public interface Table {
  Frame frame();
  Item put(Attributes attrs);
  Region region();
}

Interface Frame, returned by the frame() method, also has its own methods. And so on. In order to create a properly mocked instance of interface Region, one would normally create a dozen other mock objects. With Mockito it will look like this:

public void testMe() {
  // many more lines here...
  Frame frame = Mockito.mock(Frame.class);
  Mockito.doReturn(...).when(frame).iterator();
  Table table = Mockito.mock(Table.class);
  Mockito.doReturn(frame).when(table).frame();
  Region region = Mockito.mock(Region.class);
  Mockito.doReturn(table).when(region).table(Mockito.anyString());
}

And all of this is just a scaffolding before the actual testing.

Sample Use Case

Let's say, you're developing a project that uses jcabi-dynamo for managing data in DynamoDB. Your class may look similar to this:

public class Employee {
  private final String name;
  private final Region region;
  public Employee(String empl, Region dynamo) {
    this.name = empl;
    this.region = dynamo;
  }
  public Integer salary() {
    return Integer.parseInt(
      this.region
        .table("employees")
        .frame()
        .where("name", this.name)
        .iterator()
        .next()
        .get("salary")
        .getN()
    );
  }
}

You can imagine how difficult it will be to unit test this class, using Mockito, for example. First, we have to mock the Region interface. Then, we have to mock a Table interface and make sure it is returned by the table() method. Then, we have to mock a Frame interface, etc.

The unit test will be much longer than the class itself. Besides that, its real purpose, which is to test the retrieval of an employee's salary, will not be obvious to the reader.

Moreover, when we need to test a similar method of a similar class, we will need to restart this mocking from scratch. Again, multiple lines of code, which will look very similar to what we have already written.

Fake Classes

The solution is to create fake classes and ship them together with real classes. This is what jcabi-dynamo is doing. Just look at its JavaDoc. There is a package called com.jcabi.dynamo.mock that contains only fake classes, suitable only for unit testing.

Even though their sole purpose is to optimize unit testing, we ship them together with production code, in the same JAR package.

This is what a test will look like, when a fake class MkRegion is used:

public class EmployeeTest {
  public void canFetchSalaryFromDynamoDb() {
    Region region = new MkRegion(
      new H2Data().with(
        "employees", new String[] {"name"},
        new String[] {"salary"}
      )
    );
    region.table("employees").put(
      new Attributes()
        .with("name", "Jeff")
        .with("salary", new AttributeValue().withN("50000"))
    );
    Employee emp = new Employee("Jeff", region);
    assertThat(emp.salary(), equalTo(50000));
  }
}

This test looks obvious to me. First, we create a fake DynamoDB region, which works on top of H2Data storage (in-memory H2 database). The storage will be ready for a single employees table with a hash key name and a single salary attribute.

Then, we put a record into the table, with a hash Jeff and a salary 50000.

Finally, we create an instance of class Employee and check how it fetches the salary from DynamoDB.

I'm currently doing the same thing in almost every open source library I'm working on. I'm creating a collection of fake classes that simplify testing inside the library and for its users.
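The same pattern works for any small interface. Here is a sketch (the Clock interface and its FkClock fake are invented for illustration; jcabi uses the Mk prefix instead) of what such a built-in fake looks like:

```java
// Clock and FkClock are invented for illustration; the pattern is the
// same as MkRegion in jcabi-dynamo: the fake ships in the same JAR.
interface Clock {
  long now();
}

// The built-in fake, suitable only for unit testing:
final class FkClock implements Clock {
  private final long time;
  FkClock(final long time) {
    this.time = time;
  }
  @Override
  public long now() {
    return this.time;
  }
}

public class Demo {
  public static void main(final String... args) {
    // A user of the library sets up a test in one line, no Mockito:
    final Clock clock = new FkClock(1500L);
    System.out.println("now: " + clock.now());
  }
}
```

One fake class, written once by the library author, replaces the same mocking scaffolding that every user would otherwise rebuild in every test.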

BTW, a great article on the same subject: tl;dw: Stop mocking, start testing by Ned Batchelder.

PS. Check this out, on a very similar subject: Mocking of HTTP Server in Java.


Remote Programming in Teamed.io


Here is an interview taken by Lisette Sutherland from www.CollaborationSuperpowers.com, a few hours ago, which I enjoyed giving :)

I answered these questions (approximately):

  • How does Teamed.io differ from other software companies (0:50)?

  • How do we control programmers remotely (1:59)?

  • Do we compare ourselves with open source (3:52)?

  • How do we build a network of programmers (5:10)?

  • Why do people like to work with us (5:40)?

  • What happens when a programmer fails (7:50)?

  • How can it be financially successful (9:40)?

  • How do we organize "team building" (11:50)?

  • What challenges do we have (14:50)?

  • What about micro-management (17:55)?

  • Can this work in a non-IT sector (19:40)?

  • What do you do to manage the team (20:48)?

  • Isn't it difficult to manage so many tasks (24:18)?

  • Do we have cultural issues (25:35)?

  • Is it true that people are not result-oriented enough (27:40)?

  • Are there any other challenges (29:12)?

  • What do I like personally about it (30:40)?

  • How do we scale our teams when we need more programmers (32:01)?

  • What does an "unlimited pool of talents" mean (34:40)?

  • What advice do I have for those who work remotely (37:50)?

  • Where do I work from, personally (39:10)?

  • How do we find clients (42:29)?

Enjoy :)


Getters/Setters. Evil. Period.


There is an old debate, started in 2003 by Allen Holub with his famous article Why getter and setter methods are evil, about whether getters/setters are an anti-pattern that should be avoided or something we inevitably need in object-oriented programming. I'll try to add my two cents to this discussion.

The gist of the following text is this: getters and setters are a terrible practice, and those who use them can't be excused. Again, to avoid any misunderstanding, I'm not saying that get/set should be avoided when possible. No. I'm saying that you should never have them near your code.

A Fish Called Wanda (1988) by Charles Crichton

Arrogant enough to catch your attention? You've been using that get/set pattern for 15 years and you're a respected Java architect? And you don't want to hear that nonsense from a stranger? Well, I understand your feelings. I felt almost the same when I stumbled upon Object Thinking by David West, the best book about object-oriented programming I've read so far. So please. Calm down and try to understand while I try to explain.

Existing Arguments


There are a few arguments against "accessors" (another name for getters and setters), in an object-oriented world. All of them, I think, are not strong enough. Let's briefly go through them.

Tell, Don't Ask Allen Holub says, "Don't ask for the information you need to do the work; ask the object that has the information to do the work for you."

Violated Encapsulation Principle An object can be torn apart by other objects, since they are able to inject any new data into it through setters. The object simply can't encapsulate its own state safely enough, since anyone can alter it.

Exposed Implementation Details If we can get an object out of another object, we are relying too much on the first object's implementation details. If tomorrow it changes, say, the type of that result, we will have to change our code as well.

All these justifications are reasonable, but they are missing the main point.

Fundamental Disbelief

Most programmers believe that an object is a data structure with methods. I'm quoting Getters and Setters Are Not Evil, an article by Bozhidar Bozhanov:

But the majority of objects for which people generate getters and setters are simple data holders.

This misconception is the consequence of a huge misunderstanding! Objects are not "simple data holders." Objects are not data structures with attached methods. This "data holder" concept came to object-oriented programming from procedural languages, especially C and COBOL. I'll say it again: an object is not a set of data elements and functions that manipulate them. An object is not a data entity.

What is it then?

A Ball and A Dog

In true object-oriented programming, objects are living creatures, like you and me. They are living organisms, with their own behavior, properties and a life cycle.

Can a living organism have a setter? Can you "set" a ball to a dog? Not really. But that is exactly what the following piece of software is doing:

Dog dog = new Dog();
dog.setBall(new Ball());

How does that sound?

Can you get a ball from a dog? Well, you probably can, if she ate it and you're doing surgery. In that case, yes, we can "get" a ball from a dog. This is what I'm talking about:

Dog dog = new Dog();
Ball ball = dog.getBall();

Or an even more ridiculous example:

Dog dog = new Dog();
dog.setWeight("23kg");

Can you imagine this transaction in the real world? :)

Does it look similar to what you're writing every day? If yes, then you're a procedural programmer. Admit it. And this is what David West has to say about it, on page 30 of his book:

Step one in the transformation of a successful procedural developer into a successful object developer is a lobotomy.

Do you need a lobotomy? Well, I definitely needed one and received it, while reading West's Object Thinking.

Object Thinking

Start thinking like an object and you will immediately rename those methods. This is what you will probably get:

Dog dog = new Dog();
dog.take(new Ball());
Ball ball = dog.give();

Now, we're treating the dog as a real animal, who can take a ball from us and can give it back, when we ask. Worth mentioning is that the dog can't give NULL back. Dogs simply don't know what NULL is :) Object thinking immediately eliminates NULL references from your code.

Besides that, object thinking will lead to object immutability, like in the "weight of the dog" example. You would re-write that like this instead:

Dog dog = new Dog("23kg");
String weight = dog.weight();

The dog is an immutable living organism, which doesn't allow anyone from the outside to change her weight, or size, or name, etc. She can tell, on request, her weight or name. There is nothing wrong with public methods that demonstrate requests for certain "insides" of an object. But these methods are not "getters" and they should never have the "get" prefix. We're not "getting" anything from the dog. We're not getting her name. We're asking her to tell us her name. See the difference?
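Put together, a minimal Dog class in this style might look like the sketch below. The names Ball, Dog, take(), give() and weight() follow the snippets above; everything else (the exception, the ball-holding behavior) is my assumption, a sketch rather than a recommendation:

```java
// A sketch only: Ball and Dog follow the snippets above; error
// handling and the rest of the class are assumptions.
final class Ball {
}

final class Dog {
  private final String weight;
  private Ball ball;
  Dog(final String weight) {
    this.weight = weight;
  }
  String weight() {
    // The dog tells us her weight; we don't "get" it.
    return this.weight;
  }
  void take(final Ball b) {
    this.ball = b;
  }
  Ball give() {
    // The dog never gives NULL back; she complains instead.
    if (this.ball == null) {
      throw new IllegalStateException("I don't have a ball!");
    }
    final Ball given = this.ball;
    this.ball = null;
    return given;
  }
}

public class Pets {
  public static void main(final String... args) {
    final Dog dog = new Dog("23kg");
    dog.take(new Ball());
    final Ball ball = dog.give();
    System.out.println("weight: " + dog.weight());
  }
}
```

There is not a single get or set prefix here, yet the object happily answers every reasonable request.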

We're not talking semantics here, either. We are differentiating the procedural programming mindset from an object-oriented one. In procedural programming, we're working with data, manipulating them, getting, setting, and deleting when necessary. We're in charge, and the data is just a passive component. The dog is nothing to us---it's just a "data holder." It doesn't have its own life. We are free to get whatever is necessary from it and set any data into it. This is how C, COBOL, Pascal and many other procedural languages work(ed).

On the contrary, in a true object-oriented world, we treat objects like living organisms, with their own date of birth and a moment of death---with their own identity and habits, if you wish. We can ask a dog to give us some piece of data (for example, her weight), and she may return us that information. But we always remember that the dog is an active component. She decides what will happen after our request.

That's why, it is conceptually incorrect to have any methods starting with set or get in an object. And it's not about breaking encapsulation, like many people argue. It is whether you're thinking like an object or you're still writing COBOL in Java syntax.

PS. Yes, you may ask---what about JavaBeans, JPA, JAXB, and many other Java APIs that rely on the get/set notation? What about Ruby's built-in feature that simplifies the creation of accessors? Well, all of that is our misfortune. It is much easier to stay in the primitive world of procedural COBOL than to truly understand and appreciate the beautiful world of true objects.

PPS. Forgot to say, yes, dependency injection via setters is also a terrible anti-pattern. About it, in one of the next posts :)

PPPS. Here is what I'm suggesting to use instead of getters: printers.


Deploying to Heroku, in One Click


There have already been a few articles about our usage of Rultor for automating continuous delivery cycles of Java and Ruby projects, including RubyGems, CloudBees and Maven Central.

This one describes how Heroku deployment can be automated. When I need to deploy a new version of an Aintshy web application, all I do is create one message in a GitHub ticket. I just say @rultor release 0.1.4 and version 0.1.4 gets deployed to Heroku. See GitHub ticket #5.

You can do the same, with the help of Rultor.com, a free hosted DevOps assistant.

Create Heroku Project

Create a new project at Heroku.com.

Then install their command line tool-belt.

Authenticate at Heroku

You should authenticate your public SSH key at Heroku, using their command line tool-belt. The process is explained here, but it is not much of a process. You just run heroku login and enter your login credentials. As a result, you will get your existing key (located at ~/.ssh/id_rsa.pub) authenticated by Heroku.

If you didn't have the key before, it will be created automatically.

Encrypt SSH Key

Now, encrypt id_rsa and id_rsa.pub (they are in the ~/.ssh directory) with the rultor command line tool:

$ gem install rultor
$ rultor encrypt -p me/test id_rsa
$ rultor encrypt -p me/test id_rsa.pub

Instead of me/test use the name of your GitHub project.

You will get two new files id_rsa.asc and id_rsa.pub.asc. Add them to the root directory of your project, commit and push. These files contain your secret information, but only the Rultor server can decrypt them.

Create Rultor Configuration

Create a .rultor.yml file in the root directory of your project (reference page explains this format in detail):

decrypt:
  id_rsa: "repo/id_rsa.asc"
  id_rsa.pub: "repo/id_rsa.pub.asc"
release:
  script: |
    mvn versions:set "-DnewVersion=${tag}"
    git commit -am "${tag}"
    mvn clean install -Pqulice --errors
    git remote add heroku git@heroku.com:aintshy.git
    mkdir ~/.ssh
    mv ../id_rsa ../id_rsa.pub ~/.ssh
    chmod -R 600 ~/.ssh/*
    echo -e \
      "Host *\n  StrictHostKeyChecking no\n  UserKnownHostsFile=/dev/null" \
      > ~/.ssh/config
    git push -f heroku $(git symbolic-ref --short HEAD):master

You can compare your file with live Rultor configuration of aintshy/hub.

Run It!


Now it's time to see how it all works. Create a new ticket in the GitHub issue tracker, and post something like this into it (read more about Rultor commands):

@rultor release, tag is `0.1`

You will get a response in a few seconds. The rest will be done by Rultor.

Enjoy :)

BTW, if something doesn't work as I've explained, don't hesitate to submit a ticket to the Rultor issue tracker. I will try to help you.

PS. I would also recommend versioning artifacts through MANIFEST.MF and using jcabi-manifests to read them later.


Deployment Script vs. Rultor


When I explain how Rultor automates deployment/release processes, very often I hear something like:

But I already have a script that deploys everything automatically.

This response is very common, so I decided to summarize my three main arguments for automated Rultor deployment/release processes in one article: 1) isolated docker containers, 2) visibility of logs and 3) security of credentials.

Read about them and see what Rultor gives you on top of your existing deployment script(s).

Charlie and the Chocolate Factory (2005) by Tim Burton
Charlie and the Chocolate Factory (2005) by Tim Burton

Before we start with the arguments, let me emphasize that Rultor is a useful interface to your custom scripts. When you decide to automate deployment with Rultor, you don't throw away any of your existing scripts. You just teach Rultor how to call them.

Isolated Docker Containers

The first advantage you get once you start calling your deployment scripts from Rultor is the usage of Docker. I'm sure you know what Docker is, but for those who don't---it is a manager of virtual Linux "machines." It's a command line script that you call when you need to run some script in a new virtual machine (aka "container"). Docker starts the container almost immediately and runs your script. The beauty of Docker is that every container is a perfectly isolated Linux environment, with its own file system, memory, processes, etc.

When you tell Rultor to run your deployment script, it starts a new Docker container and runs your script there. But what benefit does this give me, you ask?

The main benefit is that the container gets destroyed right after your script is done. This means that you can do all pre-configuration inside the container without any fear of conflict with your main working platform. Let me give an example.

I'm developing on a MacBook, where I install and remove the packages I need for development. At the same time, I have a project that, in order to be deployed, requires PHP 5.3, MySQL 5.6, Phing, PHPUnit, PHPCS and xdebug. Every MacOS version needs to be configured specifically to get these applications up and running, and it's a time-consuming job.

I can change laptops, and I can change MacOS versions, but the project stays the same. It still requires the same set of packages in order to run its deployment script successfully. And the project is not in active development any more. I simply don't need these packages for my day-to-day work, since I'm working with Java more now. But, when I need to make a minor fix to that PHP project and deploy it, I have to install all the required PHP packages and configure them. Only after that can I deploy that minor fix.

It is annoying, to say the least.

Docker gives me the ability to automate all of this together. My existing deployment script will get a preamble, which installs and configures all the necessary PHP-related packages in a clean Ubuntu container. This preamble will be executed on every run of my deployment script, inside a Docker container.

My deployment script looked like this before I started to use Rultor:

#!/bin/bash
phing test
git ftp push --user ".." \
  --passwd ".." \
  --syncroot php/src \
  ftp://ftp.example.com/

Just two lines. The first one is a full run of unit tests. The second one is an FTP deployment to the production server. Very simple. But this script will only work if PHP 5.3, MySQL, Phing, xdebug, PHPCS and PHPUnit are installed. Again, it's a lot of work to install and configure them every time I upgrade my MacOS or change a laptop.

Needless to say, if/when someone joins the project and tries to run my scripts, he/she will have to do this pre-installation work again.

So, here is a new script, which I'm using now. It is being executed inside a new Docker container, every time:

#!/bin/bash
# First, we install all prerequisites
sudo apt-get install -y php5 php5-mysql mysql
sudo apt-get install php-pear
sudo pear channel-discover pear.phpunit.de
sudo pear install phpunit/PHPUnit
sudo pear install PHP_CodeSniffer
sudo pecl install xdebug
sudo pear channel-discover pear.phing.info
sudo pear install phing/phing
# And now the same script I had before
phing test
git ftp push --user ".." \
  --passwd ".." \
  --syncroot php/src \
  ftp://ftp.example.com/

Obviously, running this script on my MacBook (without virtualization) would cause a lot of trouble. Well, I don't even have apt-get here :)

Thus, the first benefit that Rultor gives you is an isolation of your deployment script in its own virtual environment. We have this mostly thanks to Docker.

Visibility of Logs

Traditionally, we keep deployment scripts in some ~/deploy directory and run them with a magic set of parameters. In a small project, you do this yourself and this directory is on your own laptop. In a bigger project, there is a "deployment" server, that has that magic directory with a set of scripts that can be executed only by a few trusted senior developers. I've seen this setup many times.

The biggest issue here is traceability. It's almost impossible to find out who deployed what and why some particular deployment failed. The senior deployment gurus simply SSH to the server and run those magic scripts with magic parameters. Logs are usually lost and problem tracking is very difficult or impossible.

Rultor offers something different. With Rultor, there is no SSH access to deployment scripts any more. All scripts stay in the .rultor.yml configuration file, and you start them by posting messages in your issue tracking system (for example GitHub, JIRA or Trac). Rultor runs the script and publishes its full log right to your ticket. The log stays with your project forever. You can always get back to the ticket you were working with and check why deployment failed and what instructions were actually executed.

For example, check out this GitHub issue, where I was deploying a new version of Rultor itself, and failed a few times: yegor256/rultor#563. All my failed attempts are logged. I can always go back to them and investigate. For a big project this information is vital.

Thus, the second benefit of Rultor versus a standalone deployment script is visibility of every single operation.

Security of Credentials

When you have a custom script sitting in your laptop or in that secret team deployment server, your production credentials stay close to it. There is just no other way. If your software works with a database, it has to know login credentials (user name, password, DB name, port number, etc.). Well, in the worst case, some people just hard code that information right into the source code. We aren't even going to discuss this case, that's how bad it is.

But let's say you separate your DB credentials from the source code. You will have something like a db.properties or db.ini file, which will be attached to the application right before deployment. You can also keep that file directly in the production server, which is even better, but not always possible, especially with PaaS deployments, for example.

A similar problem exists with deployments of artifacts to repositories. Say, you're regularly deploying to RubyGems.org. Your ~/.gem/credentials will contain your secret API key.

So, very often, your deployment scripts are accompanied by some files with sensitive and secure information. And these files have this information in a plain, open format. No encryption, no protection. Just user names, passwords, codes and tokens in plain text.

Why is this bad? Well, for a single developer with a single laptop this doesn't sound like a problem. Although, I don't like the idea of losing a laptop somewhere in an airport with all credentials open and ready to be used. You may argue that there are disk encryption tools, like FileVault for macOS or BestCrypt for Windows. Yes, maybe.

But let's see what happens when we have a team of developers, working together and sharing those deployment scripts and files with credentials. Once you give a new member of the team access to your deployment scripts, you have to share all that sensitive data. There is just no way around it. In order to use the scripts, he/she has to be able to open the files with credentials.

This is a problem, if you care about the security of your data.

Rultor solves this problem by offering an on-the-fly GPG decryption of your sensitive data, right before they are used by your deployment scripts. In the .rultor.yml configuration file you just say:

decrypt:
  db.ini: "repo/db.ini.asc"
deploy:
  script:
    ftp put db.ini production

Then, you encrypt your db.ini using a Rultor GPG key, and fearlessly commit db.ini.asc to the repository. Nobody will be able to open and read that file, except the Rultor server itself, right before running the deployment script.
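The encryption itself can be done with the rultor command-line gem, the same way as for any other secret file (me/test here stands for the name of your own GitHub project):

```shell
$ gem install rultor
$ rultor encrypt -p me/test db.ini
```

This produces db.ini.asc, which is safe to commit to the repository.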

Thus, the third benefit of Rultor versus a standalone deployment script is proper security of sensitive data.

© Yegor Bugayenko 2014–2018

Anti-Patterns in OOP


RESTful API and a Web Site in the Same URL


Look at GitHub RESTful API, for example. To get information about a repository you should make a GET request to api.github.com/repos/yegor256/rultor. In response, you will get a JSON document with all the details of the yegor256/rultor repository. Try it, the URL doesn't require any authentication.

To open the same repository in a nice HTML+CSS page, you should use a different URL: github.com/yegor256/rultor. The URL is different, the server-side is definitely different, but the nature of the data is exactly the same. The only thing that changes is a representation layer.

In the first case, we get JSON; in the second---HTML.

How about combining them? How about using the same URL and the same server-side processing mechanism for both of them? How about shifting the whole rendering task to the client-side (the browser) and letting the server work solely with the data?

The Good, the Bad, the Weird (2008) by Kim Jee-woon

XSLT is the technology that can help us do this. In XML+XSLT in a Browser I explained briefly how it works in a browser. In a nutshell, the server returns an XML with some data and a link to the XSL stylesheet. The stylesheet, being executed in a browser, converts XML to HTML. XSL language is as powerful as any other rendering engine, like JSP, JSF, Tiles, or what have you. Actually, it is much more powerful.
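For illustration, such a server response might look like the following (a hypothetical employee.xml of mine; the file names and markup are not from the original post). The processing instruction on the second line is what tells the browser which stylesheet to apply:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/xsl/employee.xsl"?>
<employee>
  <name>Jeff</name>
</employee>
```

And a minimal stylesheet that turns it into HTML right in the browser:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/employee">
    <html><body><h1><xsl:value-of select="name"/></h1></body></html>
  </xsl:template>
</xsl:stylesheet>
```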

Using this approach we literally remove the entire rendering layer ("View" in the MVC paradigm) from the server and move it to the browser.

If we make this possible, the web server will expose just a RESTful API, and every response page will have an XSL stylesheet attached. What do we gain? We'll discuss that at the end of the post. Now, let's see what problems we will face:

  1. JSON doesn't have a rendering layer. There is no such thing as XSLT for JSON. So, we will have to forget about JSON and stay with XML only. For me, this sounds perfectly all right. Others don't like XML and prefer to work with JSON only. Never understood them :)

  2. XSLT 2.0 is not supported by all browsers. Even XSLT 1.0 is only supported by some of them. For example, Internet Explorer 8 doesn't support XSLT at all.

  3. Browsers support only the GET and POST HTTP methods, while traditional RESTful API-s also use, at least, PUT and DELETE.

The first problem is not really a problem. It's just a matter of taste (and level of education). The last two problems are much more serious. Let's discuss them.

XSL Transformation on the Server

XSLT is not supported by some browsers. How do we solve this?

I think that the best approach is to parse the User-Agent HTTP header in every request and guess whether this particular version of the browser supports XSLT or not. It's not so difficult to do, since this compatibility information is public.

If the browser doesn't support XSLT, we can do the transformation on the server side. We already have the XML with data, generated by the server, and we already have the XSL attached to it. All we need to do is to apply the latter to the former and obtain an HTML page. Then, we return the HTML to the browser.

Besides that, we can also pay attention to the Accept header. If it is set to application/xml or text/xml, we return XML, no matter what the User-Agent header says. This means, basically, that some API client is talking to us, not a browser. And this client is not interested in HTML, but in pure data in XML format.
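Here is a small sketch of that decision logic (a hypothetical helper of mine, not from any framework; the MSIE check is deliberately naive, while a real implementation would consult a published compatibility table):

```java
// Hypothetical helper: decide whether to return raw XML or
// render HTML on the server, based on two request headers.
public final class Negotiation {
  private Negotiation() {
  }
  // An API client asking for XML explicitly always gets XML.
  public static boolean wantsXml(final String accept) {
    return accept != null
      && (accept.contains("application/xml")
      || accept.contains("text/xml"));
  }
  // Naive guess: old Internet Explorer can't do XSLT in the
  // browser, so we transform on the server for it.
  public static boolean canTransform(final String agent) {
    return agent != null && !agent.contains("MSIE");
  }
  public static void main(final String[] args) {
    System.out.println(wantsXml("application/xml")); // true
    System.out.println(canTransform("Mozilla/4.0 (compatible; MSIE 8.0)")); // false
  }
}
```

The server would then either stream the XML as-is or apply the XSL transformation itself before responding.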

POST Instead of PUT

There is no workaround for this. Browsers don't know anything about PUT or DELETE. So, we should also forget them in our RESTful API-s. We should design our API using only two methods: GET and POST. Is this even possible? Yes. Why not? It won't look as fancy as with all six methods (some API-s also use OPTIONS and HEAD), but it will work.

What Do We Gain?

OK, here is the question---why do we need this? What's wrong with the way most people work now? Why can't we make a web site separate from the API? What benefits do we get if we combine them?

I've been combining them in all web applications I've worked with since 2011. And the biggest advantage I'm experiencing is avoiding code duplication.

It is obvious that in the server we don't duplicate controllers (in the case of MVC). We have one layer of controllers, and they control both the API and the web site (since they are one thing now).

Avoiding code duplication is a very important achievement. Moreover, I believe that it is the most important target for any software project.

These small web apps work exactly as explained above: s3auth.com, stateful.co, bibrarian.com. They are all open source, and you can see their source code in GitHub.


Simple Java SSH Client


An execution of a shell command via SSH can be done in Java, in just a few lines, using jcabi-ssh:

String hello = new Shell.Plain(
  new SSH(
    "ssh.example.com", 22,
    "yegor", "-----BEGIN RSA PRIVATE KEY-----..."
  )
).exec("echo 'Hello, world!'");

jcabi-ssh is a convenient wrapper of JSch, a well-known pure Java implementation of SSH2.

Here is a more complex scenario, where I upload a file via SSH and then read back its grepped content:

Shell shell = new SSH(
  "ssh.example.com", 22,
  "yegor", "-----BEGIN RSA PRIVATE KEY-----..."
);
File file = new File("/tmp/data.txt");
new Shell.Safe(shell).exec(
  "cat > d.txt && grep 'some text' d.txt",
  new FileInputStream(file),
  Logger.stream(Level.INFO, this),
  Logger.stream(Level.WARNING, this)
);

Class SSH, which implements interface Shell, has only one method, exec. This method accepts four arguments:

interface Shell {
  int exec(
    String cmd, InputStream stdin,
    OutputStream stdout, OutputStream stderr
  );
}

I think it's obvious what these arguments are about.

There are also a few convenient decorators that make it easier to operate with simple commands.

Shell.Safe

Shell.Safe decorates an instance of Shell and throws an exception if the exec exit code is not equal to zero. This may be very useful when you want to make sure that your command executed successfully, but don't want to duplicate if/throw in many places of your code.

Shell ssh = new Shell.Safe(
  new SSH(
    "ssh.example.com", 22,
    "yegor", "-----BEGIN RSA PRIVATE KEY-----..."
  )
);

Shell.Verbose

Shell.Verbose decorates an instance of Shell and copies stdout and stderr to the slf4j logging facility (using jcabi-log). Of course, you can combine decorators, for example:

Shell ssh = new Shell.Verbose(
  new Shell.Safe(
    new SSH(
      "ssh.example.com", 22,
      "yegor", "-----BEGIN RSA PRIVATE KEY-----..."
    )
  )
);

Shell.Plain

Shell.Plain is a wrapper of Shell that introduces a new exec method with only one argument, a command to execute. It also doesn't return an exit code, but stdout instead. This should be very convenient when you want to execute a simple command and just get its output (I'm combining it with Shell.Safe for safety):

String login = new Shell.Plain(new Shell.Safe(ssh)).exec("whoami");

Download

You need a single dependency jcabi-ssh.jar in your Maven project (get its latest version in Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-ssh</artifactId>
</dependency>

The project is in GitHub. If you have any problems, just submit an issue. I'll try to help.


How We Run as a Non-Root Inside Docker Container


Docker starts a process inside its container as a "root" user. In some cases, this is not convenient though. For example, initdb from PostgreSQL doesn't like to be started as root and will fail. In rultor.com, a DevOps team assistant, we're using Docker as a virtualization technology for every build we run.

Here is how we change the user inside a running container, right after it is started.

First, this is how we start a new Docker container:

$ sudo docker run -i -t --rm -v "$(pwd):/main" \
  yegor256/rultor /main/entry.sh

There are two files in the current directory: entry.sh and script.sh. entry.sh is the file being executed by Docker on start, and it contains the following:

#!/bin/bash
adduser --disabled-password --gecos '' r
adduser r sudo
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
su -m r -c /home/r/script.sh

script.sh will be executed as user r inside the container. And this r user will have sudo permissions. This is exactly what projects managing their DevOps procedures with rultor.com need.


How to Publish to RubyGems, in One Click


When I release a new version of jgd, a Ruby gem, to RubyGems.org, it takes 30 seconds of my time. Here is how I released a bug fix for version 1.5.1, in GitHub issue #6:

The figure

As you see, I gave a command to Rultor, and it released a new version to RubyGems. I didn't do anything else.

Now let's see how you can do the same. How you can configure your project so that the release of its new version to RubyGems.org takes just a few seconds of your time.

By the way, I assume that you're hosting your project in GitHub. If not, this entire tutorial won't work. If you are still not in GitHub, I would strongly recommend moving there.

Create RubyGems Account

Create an account in RubyGems.org.

Create rubygems.yml

Create a rubygems.yml file (you may already have it as ~/.gem/credentials):

:rubygems_api_key: d355d8940bb031bfe9acf03ed3da4c0d

You should get this API key from RubyGems. To find your API key, click on your user name when logged in to RubyGems.org and then click on "Edit Profile."

Encrypt rubygems.yml

Now, encrypt rubygems.yml with the rultor command-line gem:

$ gem install rultor
$ rultor encrypt -p me/test rubygems.yml

Instead of me/test use the name of your GitHub project.

You will get a new file rubygems.yml.asc. Add this file to the root directory of your project, commit and push. The file contains your secret information, but only the Rultor server can decrypt it.

Prepare Gemspec

In your gemspec file, make sure you use 1.0.snapshot as a version number:

# coding: utf-8
Gem::Specification.new do |s|
  # ...
  s.version = '1.0.snapshot'
  # ...
end

This version name will be replaced by Rultor during deployment.

Configure Rultor

Create a .rultor.yml file in the root directory of your project:

decrypt:
  rubygems.yml: "repo/rubygems.yml.asc"
release:
  script: |
    rm -rf *.gem
    sed -i "s/1.0.snapshot/${tag}/g" foo.gemspec
    gem build foo.gemspec
    chmod 0600 /home/r/rubygems.yml
    gem push *.gem --config-file /home/r/rubygems.yml

In this example, replace foo with the name of your gem.
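By the way, you can reproduce locally what the sed line above will do during the release (a sketch, with a hypothetical one-line gemspec and tag 0.1):

```shell
# emulate Rultor's version substitution on a tiny gemspec
printf "s.version = '1.0.snapshot'\n" > foo.gemspec
sed -i "s/1.0.snapshot/0.1/g" foo.gemspec
cat foo.gemspec   # prints: s.version = '0.1'
```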

Run It!


Now it's time to see how it all works. Create a new ticket in the GitHub issue tracker, and post something like this into it (read more about Rultor commands):

@rultor release, tag is `0.1`

You will get a response in a few seconds. The rest will be done by Rultor.

Enjoy :)

BTW, if something doesn't work as I've explained, don't hesitate to submit a ticket to Rultor issue tracker. I will try to help you.


How to Deploy to CloudBees, in One Click


When I deploy a new version of stateful.co, a Java web application, to CloudBees, it takes 30 seconds of my time. Maybe even less. Recently, I deployed version 1.6.5. You can see how it all happened, in GitHub issue #6:

The figure

As you see, I gave a command to Rultor, and it packaged, tested and deployed a new version to CloudBees. I didn't do anything else.

Now let's see how you can do the same. How you can configure your project so that the deployment of its new version to CloudBees takes just a few seconds of your time.

Since CloudBees is shutting down its PaaS service by the end of December 2014, this article will become obsolete after that date.

Configure the CloudBees Maven Plugin

Add this profile to your pom.xml:

<project>
  [..]
  <profiles>
    <profile>
      <id>cloudbees</id>
      <activation>
        <property><name>bees.id</name></property>
      </activation>
      <pluginRepositories>
        <pluginRepository>
          <id>cloudbees-public-release</id>
          <url>
            http://repository-cloudbees.forge.cloudbees.com/public-release
          </url>
        </pluginRepository>
      </pluginRepositories>
      <build>
        <pluginManagement>
          <plugins>
            <plugin>
              <artifactId>maven-deploy-plugin</artifactId>
              <configuration>
                  <skip>true</skip>
              </configuration>
            </plugin>
          </plugins>
        </pluginManagement>
        <plugins>
          <plugin>
            <groupId>com.cloudbees</groupId>
            <artifactId>bees-maven-plugin</artifactId>
            <version>1.3.2</version>
            <configuration>
              <appid>${bees.id}</appid>
              <apikey>${bees.key}</apikey>
              <secret>${bees.secret}</secret>
            </configuration>
            <executions>
              <execution>
                <id>deploy-to-production</id>
                <phase>deploy</phase>
                <goals>
                  <goal>deploy</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
    </profile>
  </profiles>
</project>

This plugin is not in Maven Central (unfortunately). That's why we have to specify that <pluginRepository>.

Pay attention to the fact that we're also disabling maven-deploy-plugin, since it would try to deploy your WAR package to the repository from the <distributionManagement> section. We want to avoid this.

The profile gets activated only when the bees.id property is defined. This won't happen during your normal development and testing, but it will occur during the deployment cycle initiated by Rultor, because we will define this property in settings.xml (discussed below).

Secure Access to CloudBees

Create an account in CloudBees and register your web application there. CloudBees is free, as long as you don't need too much computing power. I believe that web applications should be lightweight by definition, so the CloudBees free tier is an ideal choice.

Create a settings.xml file (but don't commit it to your repo!):

<settings>
  <profiles>
    <profile>
      <id>cloudbees</id>
      <properties>
        <bees.id>stateful/web</bees.id>
        <bees.key><!-- your key --></bees.key>
        <bees.secret><!-- your secret --></bees.secret>
      </properties>
    </profile>
  </profiles>
</settings>

Encrypt this file using the rultor command-line gem:

$ gem install rultor
$ rultor encrypt -p me/test settings.xml

Instead of me/test use the name of your GitHub project.

You should get a settings.xml.asc file; add it to the root directory of your project, commit and push. This file contains your CloudBees credentials, but in an encrypted format. Nobody can read it, except the Rultor server.

Configure Versions Plugin

I recommend using jcabi-parent. It configures the required plugin out-of-the-box. If you're using it, skip this step.

Otherwise, add this plugin to your pom.xml:

<project>
  [..]
  <build>
    [..]
    <plugins>
      [..]
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>versions-maven-plugin</artifactId>
        <version>2.1</version>
        <configuration>
          <generateBackupPoms>false</generateBackupPoms>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Configure Rultor

Create a .rultor.yml file in the root directory of your project (this reference page explains this format in detail):

decrypt:
  settings.xml: "repo/settings.xml.asc"
release:
  script: |
    mvn versions:set "-DnewVersion=${tag}"
    git commit -am "${tag}"
    mvn clean deploy --settings /home/r/settings.xml

You can compare your file with live Rultor configuration of stateful.co.

Run It!


Now it's time to see how it all works. Create a new ticket in the GitHub issue tracker, and post something like this into it (read more about Rultor commands):

@rultor release, tag is `0.1`

You will get a response in a few seconds. The rest will be done by Rultor.

Enjoy :)

BTW, if something doesn't work as I've explained, don't hesitate to submit a ticket to the Rultor issue tracker. I will try to help you.

Also, a similar configuration can be performed for Heroku (using jcabi-heroku-maven-plugin) and for AWS Elastic Beanstalk (using jcabi-beanstalk-maven-plugin). I'll probably dedicate individual posts to them, as well.


The Art of Software Testing by Glenford Myers


"The Art of Software Testing" by Glenford J. Myers, Tom Badgett and Corey Sandler is one of my favorite books concerning testing and software engineering in general. In this article, I will provide an overview of the book, as well as highlight the ideas and quotes that I found to be the most interesting.

There were three editions of the book. The first one was published in 1979, when I was just too young to appreciate it. The second one was published in 2004---I read it first in 2007. The third one was published just two years ago, in 2012. I bought this edition as well, and read it like it was my first time. This book is still one of the top books in the software testing domain, despite its age and some content that is rather outdated.

Outdated Content

First, let's filter out what is not worth reading (in my opinion).

There are eleven chapters, but you can easily skim through nine of them. This is because those chapters discuss concepts that are covered elsewhere with a more robust level of detail or at a much higher level of abstraction.

For example, Chapter 3 contains an eleven-page checklist to be used by a code reviewer in order to find programming mistakes. This list is definitely not comprehensive, and it can't compete with, say, "Code Complete" by Steve McConnell. I believe this checklist had significant value twenty years ago, but now it is out of date.

Chapter 5 discusses basic principles and strategies of unit testing. However, the discussion is not abstract enough for a short 25-page summary, and is not specific enough for a detailed discussion. Again, twenty years ago this information may have had some value. Nowadays, "Growing Object-Oriented Software, Guided by Tests" by Steven Freeman and Nat Pryce is a much better source for this subject.

There are also chapters about usability testing, debugging, web application testing, and mobile testing. Here we have the same issue---they are not abstract enough, and they are much too outdated to be relevant to the current issues in software testing. I would recommend readers briefly skim those subjects for background information, but not read too much into them.

Psychology of Testing

The most important and valuable part of the book is Chapter 2. It is full of priceless quotes that can also be very practical. For example, on page 6:

Testing is a destructive, even sadistic, process, which explains why most people find it difficult


In Chapter 2, Dr. Myers discusses the psychology of testing and a very common and crucial misunderstanding of testing objectives. He claims that it is commonly accepted that the goal of software testing is "to show that a program performs its intended functions correctly" (p.5). Testers are hired to check whether the software functions as expected. They then report back to management whether all tests have successfully passed and whether the program can be delivered to end users.

Despite the plethora of software testing tomes available on the market today, many developers seem to have an attitude that is counter to extensive testing

This is what Dr. Myers says on the second page, and I can humbly confirm that in all software groups I've worked in thus far, almost everyone, including testers, project managers, and programmers, shares this philosophy. They all believe that "testing is the process of demonstrating that errors are not present" (p.5).

However, "these definitions are upside down" (p.6). The psychology of testing should be viewed as the opposite. There are two quotes that support this theory and I feel that they make the entire book.

The first one, on page 6, defines the goal of software testing:

Testing is the process of executing a program with the intent of finding errors

The second one, on the following page, further refines the first goal:

An unsuccessful test case is one that causes a program to produce the correct results without finding any errors

Dr. Myers comes back to these two thoughts in every chapter. He reiterates over and over again that we should change the underlying psychology of how we view testing, in order to change our testing results. We should focus on breaking the software instead of confirming that it works. Because testing is a "sadistic process" (p.6) of breaking things. It is a "destructive process" (p.8).

If you read Chapter 2 very carefully and truly understand its underlying ideas, it may change your entire life :) This chapter should be a New Testament for every tester.

Test Completion Criteria

In Chapter 2, Dr. Myers also mentions that a program, no matter how simple, contains an unlimited number of errors. He says that "you cannot test a program to guarantee that it is error free" (p.10) and that "it is impractical, often impossible, to find all the errors in a program" (p.8).

Furthermore, at the end of Chapter 6, he makes an important observation (p.135):

One of the most difficult questions to answer when testing a program is determining when to stop, since there is no way of knowing if the error just detected is the last remaining error

The problem is obvious. Since any program contains an unlimited number of errors, it doesn't matter how long we test, we won't find all of them. So when do we stop? What goals do we set for our testers? And even more importantly, when do we pay them and how much (this question is important to me since I only work with contractors and am required to define measurable and achievable goals)?

The answer Dr. Myers gives is brilliant (p.136):

Since the goal of testing is to find errors, why not make the completion criterion the detection of some predefined number of errors?

He then goes on to discuss exactly how this "predefined number" can be estimated. I find this idea very interesting. I have even applied it to a few projects I've had in the last few years. It works. However, it can also cause serious psychological problems for the team. Most people simply resent the goal of "testing until you find a required number of bugs." The most common response is "What if there are no bugs anymore?"

However, after a few fights, the team eventually begins to appreciate the concept and gets used to it. So, I can humbly confirm that Dr. Myers is right in his suggestion. You can successfully plan testing based on a predefined number of errors.

Summary

I consider this book a fundamental work in the area of software testing. This is mostly due to Chapter 2 of the book. In fact, there are just three pages of text that build the foundation of the entire book. They are the skeleton of the other two hundred pages.

Unfortunately, since 1979, this skeleton hasn't become the backbone of the software testing industry. Most of us are still working against these principles.


How to Release to Maven Central, in One Click


When I release a new version of jcabi-aspects, a Java open source library, to Maven Central, it takes 30 seconds of my time. Maybe even less. Recently, I released version 0.17.2. You can see how it all happened, in GitHub issue #80:

The figure

As you see, I gave a command to Rultor, and it released a new version to Maven Central. I didn't do anything else.

Now let's see how you can do the same. How you can configure your project so that the release of its new version to Maven Central takes just a few seconds of your time.

By the way, I assume that you're hosting your project in GitHub. If not, this entire tutorial won't work. If you are still not in GitHub, I would strongly recommend moving there.

Prepare Your POM

Make sure your pom.xml contains all elements required by Sonatype, explained in Central Sync Requirements. We will deploy to Sonatype, and they will synchronize all JAR (and not only) artifacts to Maven Central.

Register a Project With Sonatype

Create an account in Sonatype JIRA and raise a ticket, asking to approve your groupId. This OSSRH Guide explains this step in more detail.

Create and Distribute a GPG Key

Create a GPG key and distribute it, as explained in this Working with PGP Signatures article.

When this step is done, you should have two files: pubring.gpg and secring.gpg.

Create settings.xml

Create settings.xml, next to the two .gpg files created in the previous step:

<settings>
  <profiles>
    <profile>
      <id>foo</id> <!-- give it the name of your project -->
      <properties>
        <gpg.homedir>/home/r</gpg.homedir>
        <gpg.keyname>9A105525</gpg.keyname>
        <gpg.passphrase>my-secret</gpg.passphrase>
      </properties>
    </profile>
  </profiles>
  <servers>
    <server>
      <id>oss.sonatype.org</id>
      <username><!-- Sonatype JIRA user name --></username>
      <password><!-- Sonatype JIRA pwd --></password>
    </server>
  </servers>
</settings>

In this example, 9A105525 is the ID of your public key, and my-secret is the pass phrase you have used while generating the keys.

Encrypt Security Assets

Now, encrypt these three files with the rultor command-line gem:

$ gem install rultor
$ rultor encrypt -p me/test pubring.gpg
$ rultor encrypt -p me/test secring.gpg
$ rultor encrypt -p me/test settings.xml

Instead of me/test you should use the name of your GitHub project.

You will get three new files: pubring.gpg.asc, secring.gpg.asc and settings.xml.asc. Add them to the root directory of your project, commit and push. The files contain your secret information, but only the Rultor server can decrypt them.

Add Sonatype Repositories

I would recommend using jcabi-parent, as a parent pom for your project. This will make many further steps unnecessary. If you're using jcabi-parent, skip this step.

However, if you don't use jcabi-parent, you should add these two repositories to your pom.xml:

<project>
  [...]
  <distributionManagement>
    <repository>
      <id>oss.sonatype.org</id>
      <url>https://oss.sonatype.org/service/local/staging/deploy/maven2/</url>
    </repository>
    <snapshotRepository>
      <id>oss.sonatype.org</id>
      <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    </snapshotRepository>
  </distributionManagement>
</project>

Configure GPG Plugin

Again, I'd recommend using jcabi-parent, which configures this plugin automatically. If you're using it, skip this step.

Otherwise, add this plugin to your pom.xml:

<project>
  [..]
  <build>
    [..]
    <plugins>
      [..]
      <plugin>
        <artifactId>maven-gpg-plugin</artifactId>
        <version>1.5</version>
        <executions>
          <execution>
            <id>sign-artifacts</id>
            <phase>verify</phase>
            <goals>
              <goal>sign</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Configure Versions Plugin

Once again, I recommend using http://parent.jcabi.com. It configures all required plugins out-of-the-box. If you're using it, skip this step.

Otherwise, add this plugin to your pom.xml:

<project>
  [..]
  <build>
    [..]
    <plugins>
      [..]
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>versions-maven-plugin</artifactId>
        <version>2.1</version>
        <configuration>
          <generateBackupPoms>false</generateBackupPoms>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Configure Sonatype Plugin

Yes, you're right, http://parent.jcabi.com will help you here as well. If you're using it, skip this step too.

Otherwise, add these four plugins to your pom.xml:

<project>
  [..]
  <build>
    [..]
    <plugins>
      [..]
      <plugin>
        <artifactId>maven-deploy-plugin</artifactId>
        <configuration>
          <skip>true</skip>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-source-plugin</artifactId>
        <executions>
          <execution>
            <id>package-sources</id>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-javadoc-plugin</artifactId>
        <executions>
          <execution>
            <id>package-javadoc</id>
            <phase>package</phase>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.sonatype.plugins</groupId>
        <artifactId>nexus-staging-maven-plugin</artifactId>
        <version>1.6</version>
        <extensions>true</extensions>
        <configuration>
          <serverId>oss.sonatype.org</serverId>
          <nexusUrl>https://oss.sonatype.org/</nexusUrl>
          <description>${project.version}</description>
        </configuration>
        <executions>
          <execution>
            <id>deploy-to-sonatype</id>
            <phase>deploy</phase>
            <goals>
              <goal>deploy</goal>
              <goal>release</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Create Rultor Configuration

Create a .rultor.yml file in the root directory of your project (the reference page explains this format in detail):

decrypt:
  settings.xml: "repo/settings.xml.asc"
  pubring.gpg: "repo/pubring.gpg.asc"
  secring.gpg: "repo/secring.gpg.asc"
release:
  script: |
    mvn versions:set "-DnewVersion=${tag}"
    git commit -am "${tag}"
    mvn clean deploy -Pjcabi --settings /home/r/settings.xml

You can compare your file with live Rultor configuration of jcabi-aspects.

Run It!


Now it's time to see how it all works. Create a new ticket in the GitHub issue tracker, and post something like this in it (read more about Rultor commands):

@rultor release, tag is `0.1`

You will get a response in a few seconds. The rest will be done by Rultor.

Enjoy :)

BTW, if something doesn't work as I've explained, don't hesitate to submit a ticket to Rultor issue tracker. I will try to help you.

Yeah, forgot to mention, Rultor is also doing two important things. First, it creates a GitHub release with a proper description. Second, it posts a tweet about the release, which you can retweet, to make an announcement to your followers. Both features are very convenient for me. For example:

© Yegor Bugayenko 2014–2018

Fluent JDBC Decorator


This is how you fetch text from a SQL table with jcabi-jdbc:

String name = new JdbcSession(source)
  .sql("SELECT name FROM employee WHERE id = ?")
  .set(1234)
  .select(new SingleOutcome<String>(String.class));

Simple and straightforward, isn't it? The library simplifies interaction with relational databases via JDBC, avoiding the need to use an ORM.
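
The chaining works because every intermediate call returns the session object itself. Here is a minimal sketch of that fluent pattern, with a hypothetical FluentSql class that merely renders the statement instead of executing it (jcabi-jdbc's real JdbcSession binds the values to a PreparedStatement):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the fluent-builder pattern behind
// JdbcSession: every intermediate call returns `this`.
class FluentSql {
  private String sql;
  private final List<Object> args = new ArrayList<>();

  public FluentSql sql(String query) {
    this.sql = query;
    return this;
  }

  public FluentSql set(Object value) {
    this.args.add(value);
    return this;
  }

  // Terminal operation: render the statement with its parameters
  // (a real session would execute it against the data source).
  public String render() {
    String out = this.sql;
    for (final Object arg : this.args) {
      out = out.replaceFirst("\\?", arg.toString());
    }
    return out;
  }
}
```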

jcabi-jdbc is a lightweight wrapper of JDBC. It is very convenient to use when you don't need a full-scale ORM (like Hibernate), but want just to select, insert, or update a few rows in a relational database.

Every instance of JdbcSession is a "transaction" in a database. You start it by instantiating the class with a single parameter---data source.

You can obtain the data source from your connection pool. There are many implementations of connection pools. I would recommend that you use BoneCP. Below is an example of how you would connect to PostgreSQL:

@Cacheable(forever = true)
private static DataSource source() {
  BoneCPDataSource src = new BoneCPDataSource();
  src.setDriverClass("org.postgresql.Driver");
  src.setJdbcUrl("jdbc:postgresql://localhost/db_name");
  src.setUsername("jeff");
  src.setPassword("secret");
  return src;
}

Be sure to pay attention to the @Cacheable annotation. This post explains how it can help you cache Java method results for some time. Setting the forever attribute to true means that we don't want this method to be called more than once. Instead, we want the connection pool to be created just once, and every subsequent call should return the existing instance (kind of like a Singleton pattern).
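
In plain Java, forever-caching boils down to a lazy singleton, roughly like this (a hand-written sketch with illustrative names; the real jcabi-aspects version is woven in by AspectJ and works for any annotated method):

```java
import java.util.concurrent.atomic.AtomicInteger;

// What @Cacheable(forever = true) effectively gives you: the first
// call builds the object, every later call returns the same instance.
class Pool {
  static final AtomicInteger CALLS = new AtomicInteger();
  private static Object cached;

  static synchronized Object source() {
    if (cached == null) {
      cached = create(); // expensive construction happens only once
    }
    return cached;
  }

  private static Object create() {
    CALLS.incrementAndGet(); // counts real constructions, for the test
    return new Object(); // stands in for the BoneCP data source
  }
}
```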

jcabi-jdbc website explains how you can insert, update, or delete a row. You can also execute any SQL statement.

By default, JdbcSession closes the JDBC connection right after the first select/update/insert operation. Simply put, it is designed to be used mainly for single atomic transactions. However, it is possible to leave the connection open and continue, for example:

new JdbcSession(source)
  .autocommit(false)
  .sql("START TRANSACTION")
  .update()
  .sql("DELETE FROM employee WHERE name = ?")
  .set("Jeff Lebowski")
  .update()
  .sql("INSERT INTO employee VALUES (?)")
  .set("Walter Sobchak")
  .insert(Outcome.VOID)
  .commit();

In this example we're executing three SQL statements one by one, leaving connection (and transaction) open until commit() is called.


How to Retry Java Method Call on Exception


Say you have a method that fails occasionally, and you want to retry it a few times before throwing an exception. The @RetryOnFailure annotation from jcabi-aspects can help. For example, this is how you could retry downloading a web page:

@RetryOnFailure(
  attempts = 3,
  delay = 10,
  unit = TimeUnit.SECONDS
)
public String load(URL url) throws IOException {
  return url.openConnection().getContent().toString();
}

This method call will throw an exception only after three failed attempts, with a ten-second interval between them.

This post explains how jcabi-aspects works with binary weaving. This mechanism integrates AspectJ with your code.

When the method load() from the example above is called, this is what happens behind the scenes (pseudo-code):

while (attempts++ < 3) {
  try {
    return original_load(url);
  } catch (Throwable ex) {
    log("we failed, will try again in 10 seconds");
    sleep(10);
  }
}
throw ex; // all attempts failed: give up and rethrow
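
The same loop can be written as a small self-contained helper, to make the pseudo-code concrete (an illustrative sketch, not the actual woven code):

```java
import java.util.concurrent.Callable;

// Self-contained version of the retry pseudo-code: try the task up to
// `attempts` times, pausing between failures, then rethrow.
class Retry {
  static <T> T retry(int attempts, long delayMs, Callable<T> task)
    throws Exception {
    Exception last = null;
    for (int attempt = 0; attempt < attempts; ++attempt) {
      try {
        return task.call();
      } catch (Exception ex) {
        last = ex; // remember the failure and pause before the next try
        Thread.sleep(delayMs);
      }
    }
    throw last; // all attempts failed: rethrow the last exception
  }
}
```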

This approach may be very useful in the following situations (based on my experience):

  • Executing JDBC SELECT statements

  • Loading data from HTTP, S3, FTP, etc. resources

  • Uploading data over the network

  • Fetching data through RESTful stateless APIs

The project is on GitHub.


Strict Control of Java Code Quality


There are many tools that control the quality of Java code, including Checkstyle, PMD, FindBugs, Cobertura, etc. All of them are usually used to analyze quality and build some fancy reports. Very often, those reports are published by continuous integration servers, like Jenkins.

Ratatouille (2007) by Brad Bird and Jan Pinkava
Ratatouille (2007) by Brad Bird and Jan Pinkava

Qulice takes things one step further. It aggregates a few quality checkers, configures them to a maximum strict mode, and breaks your build if any of them fail.


Seriously. There are over 130 checks in Checkstyle, over 400 rules in PMD, and over 400 bug patterns in FindBugs. All of them should say: "Yes, we like your code." Otherwise, your build shouldn't pass.

What do you think? Would it be convenient for you---to have your code rejected every time it breaks just one of 900 checks? Would it be productive for the team---to force developers to focus so much on code quality?

First Reaction

If you join one of our teams as a Java developer, you will develop your code in branches and, then, Rultor will merge your changes into master. Before actually merging, though, Rultor will run an automated build script to make sure that your branch doesn't break it.

As a static analysis tool, Qulice is just one of the steps in the automated build script. It is actually a Maven plugin, and we automate Java builds with Maven 3.x. Thus, if your changes break any of Qulice's rules, your entire branch gets rejected.

Your first reaction---I've seen it hundreds of times---will be negative. You may actually become frustrated enough to leave the project immediately. You may say something like this (I'm quoting real life stories):

  • "These quality rules entirely ruin my creativity!"

  • "Instead of wasting time on these misplaced commas and braces, we'd be better off developing new features!"

  • "I've done many successful projects in my life, never heard about this ridiculous quality checking..."

This first reaction is only logical. I've seen many people say things like this, in both open source and commercial projects. Not only in Java, but also in PHP (with phpcs and phpmd) and Ruby (with rubocop and simplecov).

How do I answer? Read on.

On Second Thought

My experience tells me that the sooner someone can get used to the strict quality control of Qulice, the faster he/she can learn and grow; the better programmer he/she is; and the further he/she can go with us and our projects.

Having this experience in mind, I recommend that all new project members be patient and try to get used to this new approach to quality. In a few weeks, those who stick with it start to understand why this approach is good for the project and for them, as Java engineers.

Why is it good? Read on.

What Do Projects Get From Qulice?

Let's take one simple rule as an example. Here is a piece of Java code that Qulice would complain about (due to the DesignForExtension rule from Checkstyle):

public class Employee {
  public String name() {
    return "Jeff";
  }
}

What is wrong with this code? Method name() is not final and can be overridden by a class that extends Employee. Design-wise this is wrong, since a child class could break the super class by overriding its method.
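
To see the danger concretely, here is a hypothetical pair of classes (illustrative, not from Qulice) where an overridable method lets a child silently break its parent:

```java
// Parent relies on its own name() method; Broken overrides it and
// thereby corrupts greeting(), which Parent believed was safe.
class Parent {
  public String name() {
    return "Jeff";
  }
  public String greeting() {
    return "Hello, " + this.name(); // relies on name() keeping its contract
  }
}

class Broken extends Parent {
  @Override
  public String name() {
    return null; // this override silently breaks greeting()
  }
}
```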

What is the right design? This one:

public class Employee {
  public final String name() {
    return "Jeff";
  }
}

Now, the method is final and can't be overridden by child classes. It is a much safer design (according to Checkstyle, and I agree).

So, let's say we make this rule mandatory for all classes in the project. What does the project gain from this? It can promise its members (programmers) a higher quality of work, compared to other projects that don't have this restriction, mostly because of:

  • Predictability of Design---I don't have to scroll through the entire class to make sure it doesn't have methods that can be accidentally overridden. I know for sure that this can't happen in this project. In other words, I know what to expect.

  • Less Hidden Tricks---Higher predictability of design leads to better visibility of mistakes and tricks. Standardization of source code makes it uniform. This means that it's easier to read and spot problems.

  • Industry Standards---The decision to use this design is made by Checkstyle, not by a project architect. For me, as a project developer, this means that I'm following industry standards. That makes the project (and its leaders) more respectable.

  • Learning---I'll bet that most of you who read this post didn't know about the design rule explained above. Just by reading this article, you learned something new. Imagine how much you could learn after making your code compliant with all 900 rules of Qulice (Checkstyle + PMD + FindBugs).

The point about learning brings us to the last, and the most important, thought to discuss.

What Do I Get from Qulice?

As a programmer, I hope you already realize what you get from working in a project that raises its quality bar as high as Qulice asks. Yes, you'll learn a lot of new things about writing quality Java code.

On top of that, though, I would actually say that you are getting free lessons with every new line of code you write. And the teacher is software written by hundreds of Java developers over the last ten years. Qulice just integrates those tools together. Truthfully, those developers are the real authors of the quality checks and rules.

So, what do I tell those who complain about quality rules being too strict? I say this: "Do you want to learn and improve, or do you just want to get paid and get away with it?"

ps. You can use my settings.jar for IntelliJ IDEA; the settings are rather strict and will help you clean your code even before Qulice starts to complain.


Cache Java Method Results


Say you have a method that takes time to execute, and you want its result to be cached. There are many solutions, including Apache Commons JCS, Ehcache, JSR 107, Guava Caching, and many others.

jcabi-aspects offers a very simple one, based on AOP aspects and Java 6 annotations:

import com.jcabi.aspects.Cacheable;
public class Page {
  @Cacheable(lifetime = 5, unit = TimeUnit.MINUTES)
  String load() throws IOException {
    return new URL("http://google.com").getContent().toString();
  }
}

The result of the load() method will be cached in memory for five minutes.

How Does It Work?

This post about AOP, AspectJ and method logging explains how "aspect weaving" works (I highly recommend that you read it first).

Here I'll explain how caching works.

The approach is very straightforward. There is a static hash map whose keys are "method coordinates" and whose values are method results. Method coordinates consist of the object that owns the method, the method name, and its parameter types.

In the example above, right after the method load() finishes, the map gets a new entry (simplified example, of course):

key: [page, "load()"]
value: "<html>...</html>"

Every consecutive call to load() will be intercepted by the aspect from jcabi-aspects and resolved immediately with a value from the cache map. The method itself will not be called again until the end of the cache lifetime, which is five minutes in the example above.
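
A much simplified sketch of that map, with illustrative names (the real aspect also handles argument values, concurrency, and asynchronous expiration):

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Keys are "method coordinates" (owner object plus method signature);
// values hold the cached result together with its expiration time.
class MethodCache {
  private static final Map<Object, Object[]> CACHE = new HashMap<>();

  static synchronized Object get(Object owner, String method,
    long lifetimeMs, Supplier<Object> original) {
    final Object key = new AbstractMap.SimpleEntry<>(owner, method);
    final long now = System.currentTimeMillis();
    Object[] entry = CACHE.get(key);
    if (entry == null || now > (Long) entry[1]) {
      // miss or expired: call the original method and remember the result
      entry = new Object[] {original.get(), now + lifetimeMs};
      CACHE.put(key, entry);
    }
    return entry[0];
  }
}
```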

What About Cache Flushing?

Sometimes it's necessary to flush the cache before the end of its lifetime. Here is a practical example:

import com.jcabi.aspects.Cacheable;
public class Employees {
  @Cacheable(lifetime = 1, unit = TimeUnit.HOURS)
  int size() {
    // calculate their amount in MySQL
  }
  @Cacheable.FlushBefore
  void add(Employee employee) {
    // add a new one to MySQL
  }
}

It's obvious that the number of employees in the database will be different after the add() method executes, so the result of size() should be invalidated in the cache. This invalidation operation is called "flushing," and @Cacheable.FlushBefore triggers it.

Actually, every call to add() invalidates all cached methods in this class, not only size().

There is also @Cacheable.FlushAfter. The difference is that FlushBefore guarantees that the cache is already invalidated when the method add() starts, while FlushAfter invalidates the cache after the method add() finishes. Sometimes, this small difference matters a lot.
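
The ordering difference can be shown in plain code (an illustrative sketch; the real annotations operate on the cached methods of the owning class):

```java
import java.util.HashMap;
import java.util.Map;

// FlushBefore empties the cache before the method body runs;
// FlushAfter lets the body run first and clears the cache afterwards.
class Flushing {
  static final Map<String, Object> CACHE = new HashMap<>();

  static void addFlushBefore(Runnable body) {
    CACHE.clear(); // the cache is already empty while body runs
    body.run();
  }

  static void addFlushAfter(Runnable body) {
    body.run(); // body may still observe stale entries
    CACHE.clear();
  }
}
```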

This article explains how to add jcabi-aspects to your project.


Rultor + Travis


Rultor is a coding team assistant. Travis is a hosted continuous integration system. In this article I'll show how our open source projects are using them in tandem to achieve seamless continuous delivery.

I'll show a few practical scenarios.

Scenario #1: Merge Pull Request

jcabi-mysql-maven-plugin is a Maven plugin for MySQL integration testing. @ChristianRedl submitted pull request #35 with a new feature. I reviewed the request and asked Rultor to merge it into master:

The figure

As you can see, an actual merge operation was made by Rultor. I gave him access to the project by adding his GitHub account to the list of project collaborators.

Before giving a "go ahead" to Rultor, I checked the status of the pre-flight build reported by Travis:

The figure

Travis found a new commit in the pull request and immediately (without any interaction from my side) triggered a build in that branch. The build didn't fail, which is why Travis gave me a green light. I looked at that light and at the code. Since all problems in the code were corrected by the pull request author and Travis didn't complain, I gave a "go" to Rultor.

Scenario #2: Continuous Integration

Even though the previous step guarantees that master branch is always clean and stable, we're using Travis to continuously integrate it. Every commit made to master triggers a new build in Travis. The result of the build changes the status of the project in Travis: either "failing" or "passing."

jcabi-aspects is a collection of AspectJ AOP aspects. We configured Travis to build it continuously. This is the badge it produces (the left one):

The figure

Again, let me stress that even though a read-only master is strong protection against broken builds, it doesn't guarantee that master is stable at any given moment. For example, sometimes unit tests fail sporadically due to changes in the calendar, the environment, dependencies, network connection quality, etc.

Well, ideally, unit tests should pass or fail consistently, because they are supposed to be environment-independent. However, in reality, unit tests are far from ideal.

That's why a combination of read-only master with Rultor and continuous integration with Travis gives us higher stability.

Scenario #3: Release to RubyGems

jekyll-github-deploy is a Ruby gem that automates deployment of Jekyll sites to GitHub Pages. @leucos submitted pull request #4 with a new feature. The request was merged successfully into the master branch.

Then, I instructed Rultor to release the master branch to RubyGems, setting the new version to 1.5:

The figure

Rultor executed a simple script, pre-configured in its .rultor.yml:

release:
  script: |
    ./test.sh
    rm -rf *.gem
    sed -i "s/2.0-SNAPSHOT/${tag}/g" jgd.gemspec
    gem build jgd.gemspec
    chmod 0600 ../rubygems.yml
    gem push *.gem --config-file ../rubygems.yml

The script is parameterized, as you can see. There is one parameter that Rultor passes into the script: ${tag}. I provided its value in the GitHub issue when I submitted the command to Rultor.

The script tests that the gem works (integration testing) and cleans up afterwards:

$ ./test.sh
$ rm -rf *.gem

Then, it changes its own version in jgd.gemspec to the one provided in ${tag} (an environment variable), and builds a new .gem:

$ sed -i "s/2.0-SNAPSHOT/${tag}/g" jgd.gemspec
$ gem build jgd.gemspec

Finally, it pushes a newly built .gem to RubyGems, using login credentials from ../rubygems.yml. This file is created by Rultor right before starting the script (this mechanism is discussed below):

$ chmod 0600 ../rubygems.yml
$ gem push *.gem --config-file ../rubygems.yml

If everything works fine and RubyGems confirms successful deployment, Rultor reports to GitHub. This is exactly what happened in pull request #4.

Scenario #4: Deploy to CloudBees

s3auth.com is a Basic HTTP authentication gateway for Amazon S3 Buckets. It is a Java web app. In its pull request #195, a resource leakage problem was fixed by @carlosmiranda and the pull request was merged by Rultor.

Then, @davvd instructed Rultor to deploy the master branch to the production environment. Rultor created a new Docker container and ran mvn clean deploy in it.

Maven deployed the application to CloudBees:

The figure

The overall procedure took 21 minutes, as you can see in the report generated by Rultor.

There is one important trick worth mentioning. Deployment to production always means using secure credentials, like login, password, SSH keys, etc.

In this particular example, the Maven CloudBees plugin needed an API key, a secret, and the web application name. These three parameters are kept secure and can't be revealed in an "open source" way.

So, there is a mechanism that configures Rultor accordingly through its .rultor.yml file (pay attention to the first few lines):

assets:
  settings.xml: "yegor256/home#assets/s3auth/settings.xml"
  pubring.gpg: "yegor256/home#assets/pubring.gpg"
  secring.gpg: "yegor256/home#assets/secring.gpg"

These YAML entries inform Rultor that it has to get the assets/s3auth/settings.xml file from the yegor256/home private (!) GitHub repository and put it into the working directory of the Docker container, right before starting the Maven build.

This settings.xml file contains the secret data the CloudBees plugin needs in order to deploy the application.

How to Deploy to CloudBees, in One Click explains this process even better.

You Can Do The Same

Both Rultor and Travis are free hosted products, provided your projects are open source and hosted on GitHub.

Other good examples of Rultor+Travis usage can be seen in these GitHub issues: jcabi/jcabi-http#47, jcabi/jcabi-http#48

PS. You can do something similar with AppVeyor, for Windows platform: How AppVeyor Helps Me to Validate Pull Requests Before Rultor Merges Them


Every Build in Its Own Docker Container


Docker is a command line tool that can run a shell command in a virtual Linux system, inside an isolated file system. Every time we build our projects, we want them to run in their own Docker containers. Take this Maven project, for example:

$ sudo docker run -i -t ubuntu mvn clean test

This command will start a new Ubuntu system and execute mvn clean test inside it. Rultor.com, our virtual assistant, does exactly that with our builds, when we deploy, package, test and merge them.

Why Docker?

What benefits does it give us? And why Docker, when there are many other virtualization technologies, like LXC, for example?

Well, there are a few very important benefits:

  • Image repository (hub.docker.com)

  • Versioning

  • Application-centric

Let's discuss them in detail.

Image Repository

Docker enables image sharing through its public repository at hub.docker.com. This means that after I prepare a working environment for my application, I make an image out of it and push it to the hub.

Let's say I want my Maven build to be executed in a container with a pre-installed graphviz package (in order to enable the dot command line tool). First, I would start a plain vanilla Ubuntu container, and install graphviz inside it:

$ sudo docker run -i -t ubuntu /bin/bash
root@215d2696e8ad:/# sudo apt-get install -y graphviz
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
...
root@215d2696e8ad:/# exit
$ sudo docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS                     PORTS               NAMES
215d2696e8ad        ubuntu:14.04        /bin/bash           About a minute ago   Exited (0) 3 seconds ago                       high_mccarthy

I have a container that stopped a few seconds ago. The container's ID is 215d2696e8ad. Now, I want to make it reusable for all further tests in Rultor.com. I have to create an image from it:

$ sudo docker commit 215d2696e8ad yegor256/beta
c5ad7718fc0e20fe4bf2c8a9bfade4db8617a25366ca5b64be2e1e8aa0de6e52

I just committed my container to a new image, yegor256/beta. This image can be reused right now. I can create a new container from this image, and it will have graphviz installed inside!

Now it's time to share my image at Docker hub, in order to make it available for Rultor:

$ sudo docker push yegor256/beta
The push refers to a repository [yegor256/beta] (len: 1)
Sending image list
Pushing repository yegor256/beta (1 tags)
511136ea3c5a: Image already pushed, skipping
d7ac5e4f1812: Image already pushed, skipping
2f4b4d6a4a06: Image already pushed, skipping
83ff768040a0: Image already pushed, skipping
6c37f792ddac: Image already pushed, skipping
e54ca5efa2e9: Image already pushed, skipping
c5ad7718fc0e: Image successfully pushed
Pushing tag for rev [c5ad7718fc0e] on {https://registry-1.docker.io/v1/repositories/yegor256/beta/tags/latest}

The last step is to configure Rultor to use this image in all builds. To do this, I will edit .rultor.yml in the root directory of my GitHub repository:

docker:
  image: yegor256/beta

That's it. From now on, Rultor will use my custom Docker image with pre-installed graphviz, in every build (merge, release, deploy, etc.)

Moreover, if and when I want to add something else to the image, it's easy to do. Say, I want to install Ruby into my build image. I start a container from the image and install it (pay attention, I'm starting a container not from ubuntu image, as I did before, but from yegor256/beta):

$ sudo docker run -i -t yegor256/beta /bin/bash
root@7e0fbd9806c9:/# sudo apt-get install -y ruby
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
...
root@7e0fbd9806c9:/# exit
$ sudo docker ps -a
CONTAINER ID        IMAGE                  COMMAND             CREATED             STATUS                     PORTS               NAMES
7e0fbd9806c9        yegor256/beta:latest   /bin/bash           28 seconds ago      Exited (0) 2 seconds ago                       pensive_pare
215d2696e8ad        ubuntu:14.04           /bin/bash           10 minutes ago      Exited (0) 8 minutes ago                       high_mccarthy

You can now see that I have two containers. The first one is the one I just used; it contains both graphviz and Ruby. The second one is the one I was using before, and it contains only graphviz.

Now I have to commit again and push:

$ sudo docker commit 7e0fbd9806c9 yegor256/beta
6cbfb7a6b18a2182f42171f6bb5aef67c4819b5c2795edffa6a63ba78aaada2d
$ sudo docker push yegor256/beta
...

Thus, the Docker hub is a very convenient feature for Rultor and similar systems.
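
By the way, the same image can also be described declaratively in a Dockerfile, instead of the interactive install-and-commit workflow shown above (a rough equivalent, not something Rultor requires):

```dockerfile
# Roughly equivalent to the manual steps above: start from Ubuntu
# and pre-install the packages the build needs.
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y graphviz ruby
```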

Versioning

As you saw in the example above, every change to a Docker image has its own version (hash) and it's possible to track changes. It is also possible to roll back to any particular change.

Rultor is not using this functionality itself, but Rultor users are able to control their build configurations with much better precision.

Application-Centric

Docker, unlike LXC or Vagrant, for example, is application-centric. This means that when we start a container---we start an application. With other virtualization technologies, when you get a virtual machine, you get a fully functional Unix environment, where you can log in through SSH and do whatever you want.

Docker makes things simpler. It doesn't give you SSH access to the container, but runs an application inside and shows you its output. This is exactly what we need in Rultor. We need to run an automated build (for example, Maven or Bundler), see its output, and get its exit code. If the code is not zero, we fail the build and report to the user.

This is how we run Maven build:

$ sudo docker run --rm -i -t yegor256/rultor mvn clean test
[INFO] ------------------------------------------------------------------------
[INFO] Building jcabi-github 0.13
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jcabi-github ---
[INFO]
...

As you can see, Maven starts immediately. We don't worry about the internals of the container. We just start an application inside it.

Furthermore, thanks to the --rm option, the container gets destroyed immediately after Maven execution is finished.

This is what application-centric is about.

Our overall impression of Docker is highly positive.

ps. A compact version of this article was published at devops.com


Rultor.com, a Merging Bot


You get a GitHub pull request. You review it. It looks correct---it's time to merge it into master. You post a comment in it, asking @rultor to test and merge. Rultor starts a new Docker container, merges the pull request into master, runs all tests and, if everything looks clean, pushes to master and closes the request.

Then, you ask @rultor to deploy the current version to production environment. It checks out your repository, starts a new Docker container, executes your deployment scripts and reports to you right there in the GitHub issue.

Why not Jenkins or Travis?

There are many tools on the market that automate continuous integration and continuous delivery (let's call them DevOps tools). For example, downloadable open source Jenkins and hosted Travis both perform these tasks. So, why do we need one more?

Well, there are three very important features that we need for our projects, but we can't find all of them in any of the DevOps tools currently available on the market:

  • Merging. We make the master branch read-only in our projects, as this article recommends. All changes reach master through a script that validates and merges them.

  • Docker. Every build should work in its own Docker container, in order to simplify configuration, isolate resources and make errors easily reproducible.

  • Tell vs. Trigger. We need to communicate with DevOps tool through commands, right from our issue tracking system (GitHub issues, in most projects). All existing DevOps systems trigger builds on certain conditions. We need our developers to be able to talk to the tool, through human-like commands in the tickets they are working with.

A combination of these three features is what distinguishes Rultor from all other existing systems.

How Rultor Merges

Once Rultor finds a merge command in one of your GitHub pull requests, it does exactly this:

  1. Reads the .rultor.yml YAML configuration file from the root directory of your repository.

  2. Gets the automated build command from it, for example bundle test.

  3. Checks out your repository into a temporary directory on one of its servers.

  4. Merges pull request into master branch.

  5. Starts a new Docker container and runs bundle test in it.

  6. If everything is fine, pushes modified master branch to GitHub.

  7. Reports back to you, in the GitHub pull request.
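
The configuration read in step 1 can be as small as this, reusing the bundle test command from step 2 (a minimal illustration of the .rultor.yml format):

```yaml
merge:
  script: bundle test
```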

You can see it in action, for example, in this pull request: jcabi/jcabi-github#878.


Master Branch Must Be Read-Only


Continuous integration is easy. Download Jenkins, install, create a job, click the button, and get a nice email saying that your build is broken (I assume your build is automated). Then, fix broken tests (I assume you have tests), and get a much better looking email saying that your build is clean.

Then, tweet about it, claiming that your team is using continuous integration.

Then, in a few weeks, start filtering out Jenkins alerts, into their own folder, so that they don't bother you anymore. Anyway, your team doesn't have the time or desire to fix all unit tests every time someone breaks them.

After all, we all know that unit testing is not for a team working with deadlines, right?

Wrong. Continuous integration can and must work.

What is Continuous Integration?

Nowadays, software development is done in teams. We develop in feature branches and isolate changes while they are in development. Then, we merge branches into master. After every merge, we test the entire product, executing all available unit and integration tests. This is called continuous integration (aka "CI").

Sometimes, some tests fail. When this happens, we say that our "build is broken." Such a failure is a positive side effect of quality control because it raises a red flag immediately after an error gets into master.

It is a well-known practice that fixing such an error becomes a top priority for its author and the entire team. The error should be fixed right after a red flag is raised by the continuous integration server.


Continuous Delivery by Jez Humble et al. explains this approach perfectly in Chapter 7, pages 169–186.

There are a few good tools on the market that automate DevOps procedures. Some of them are open source; you can download and install them on your own servers, for example: Jenkins, Go, and CruiseControl. Some of them are available as a service in the cloud, such as Travis, Shippable, Wercker, and many others.

Why Continuous Integration Doesn't Work

CI is great, but the bigger the team (and the code base), the more often builds get broken, and the longer it takes to fix them. I've seen many examples where a hard-working team starts to ignore red flags raised by Jenkins after a few weeks of trying to keep up.

The team simply becomes incapable of fixing all errors in time, mostly because the business has other priorities. Product owners do not understand the importance of a "clean build," and technical leaders can't buy time for fixing unit tests. Moreover, the code that broke them is already in master and, in most cases, has already been deployed to production and delivered to end-users. What's the urgency of fixing some tests if the business value was already delivered?

In the end, most development teams don't take continuous integration alerts seriously. Jenkins or Travis are just fancy tools for them that play no role in the entire development and delivery pipeline. No matter what continuous integration server says, we still deliver new features to our end-users. We'll fix our build later. And it's only logical.

What Is a Solution?

Four years ago, in 2010, I published an article in php|Architect called "Prevent Conflicts in Distributed Agile PHP Projects." In the article, a solution was proposed (full article in PDF) for Subversion and PHP.

Since that time, I have experimentally used that approach in multiple open source projects and a few commercial ones, with PHP, Java, Ruby and JavaScript, Git and Subversion. In all cases, my experience was only positive, and that's why rultor.com was born (more about that later).

So, the solution is simple---prohibit anyone from merging anything into master and create a script that anyone can call. The script will merge, test, and commit. The script will make no exceptions. If a branch breaks even one unit test, the entire branch will be rejected.

In other words, we should raise that red flag before the code gets into master. We should put the blame for broken tests on the shoulders of its author.

Say, I'm developing a feature in my own branch. I finished the development and broke a few tests, accidentally. It happens, we all make mistakes. I can't merge my changes into master. Git simply rejects my push, because I don't have the appropriate permissions. All I can do is call a magic script, asking it to merge my branch. The script will try to merge, but before pushing into master, it will run all tests. And if any of them break, my branch will be rejected. My changes won't be merged. Now it's my responsibility---to fix them and call the script again.

In the beginning, this approach slows down the development, because everybody has to start writing cleaner code. At the end, though, this method pays off big time.

Pre-flight Builds

Some CI servers offer a pre-flight build feature, which means testing branches before they get merged into master. Travis, for example, has this feature and it is very helpful. When you make a new commit to a branch, Travis immediately tries to build it and reports any problems in the GitHub pull request.

Pay attention: pre-flight builds don't merge. They just check whether your individual branch is clean. After a merge, it can still easily break master. And, of course, this mechanism doesn't guarantee that no collaborator commits directly to master, breaking it accidentally. Pre-flight builds are a preventive measure, but they don't solve the problem entirely.

Rultor.com

In order to start working as explained above, all you have to do is revoke write permissions to the master branch (or /trunk, in Subversion).

Unfortunately, this is not possible in GitHub. The only solution is to work through forks and pull requests only. Simply remove everybody from the list of "collaborators" and they will have to submit changes through pull requests.


Then, start using Rultor.com, which will help you to test, merge and push every pull request. Basically, Rultor is the script we were talking about above. It is available as a free cloud service.

P.S. A short version of this article is also published at devops.com.

© Yegor Bugayenko 2014–2018

Liquibase with Maven


Liquibase is a migration management tool for relational databases. It versions schema and data changes in a database, similar to the way Git or SVN works for source code. Thanks to its Maven plugin, Liquibase can be used as part of a build automation scenario.

Maven Plugin

Let's assume you're using MySQL (PostgreSQL or any other database configuration will be very similar.)

Add liquibase-maven-plugin to your pom.xml (get its latest version in Maven Central):

<project>
  [...]
  <build>
    [...]
    <plugins>
      <plugin>
        <groupId>org.liquibase</groupId>
        <artifactId>liquibase-maven-plugin</artifactId>
        <configuration>
          <changeLogFile>
            ${basedir}/src/main/liquibase/master.xml
          </changeLogFile>
          <driver>com.mysql.jdbc.Driver</driver>
          <url>jdbc:mysql://${mysql.host}:${mysql.port}/${mysql.db}</url>
          <username>${mysql.login}</username>
          <password>${mysql.password}</password>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

To check that it works, run mvn liquibase:help.

I would recommend you keep database credentials in settings.xml and in their respective profiles. For example:

<settings>
  <profiles>
    <profile>
      <id>production</id>
      <properties>
        <mysql.host>db.example.com</mysql.host>
        <mysql.port>3306</mysql.port>
        <mysql.db>example</mysql.db>
      </properties>
    </profile>
    <profile>
      <id>test</id>
      <properties>
        <mysql.host>test-db.example.com</mysql.host>
        <mysql.port>3306</mysql.port>
        <mysql.db>example-db</mysql.db>
      </properties>
    </profile>
  </profiles>
</settings>

When you run Maven, don't forget to turn on one of the profiles. For example: mvn -Pproduction.

Initial Schema

I assume you already have a database with a schema (tables, triggers, views, etc.) and some data. You should "reverse engineer" it and create an initial schema file for Liquibase. In other words, we should inform Liquibase where we are at the moment, so that it starts to apply changes from this point.

The Maven plugin doesn't support this, so you will have to run Liquibase directly. But it's not that difficult. First, run mvn liquibase:help in order to download all artifacts. Then, replace the placeholders below with your actual credentials:

$ java -jar \
  ~/.m2/repository/org/liquibase/liquibase-core/3.1.1/liquibase-core-3.1.1.jar \
  --driver=com.mysql.jdbc.Driver \
  --url=jdbc:mysql://db.example.com:3306/example \
  --username=example --password=example \
  generateChangeLog > src/main/liquibase/2014/000-initial-schema.xml

Liquibase will analyze your current database schema and save the generated changelog into src/main/liquibase/2014/000-initial-schema.xml.

Master Changeset

Now, create the master changeset in XML and save it to src/main/liquibase/master.xml:

<databaseChangeLog
  xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
    http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-2.0.xsd">
  <includeAll path="src/main/liquibase/2014" />
</databaseChangeLog>

It is the entry point for Liquibase: it starts from this file and loads all other changesets available in src/main/liquibase/2014. They should be either .xml or .sql files. I recommend you use XML, mostly because it is easier to maintain and works faster.

Incremental Changesets

Let's create a simple changeset, which adds a new column to an existing table:

<databaseChangeLog xmlns='http://www.liquibase.org/xml/ns/dbchangelog'
  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
  xsi:schemaLocation='http://www.liquibase.org/xml/ns/dbchangelog
    http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-2.0.xsd'>
  <changeSet id="002" author="Yegor">
    <sql>
      ALTER TABLE user ADD COLUMN address VARCHAR(1024);
    </sql>
  </changeSet>
</databaseChangeLog>

We save this file as src/main/liquibase/2014/002-add-user-address.xml. In big projects, you can name your files after the tickets they are produced in. For example, 045-3432.xml means changeset number 45 coming from ticket #3432.

The important thing is to have this numeric prefix in front of file names, in order to sort them correctly. We want changes to be applied in their correct chronological order.
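
As a quick illustration of why the numeric prefix matters: Liquibase effectively relies on lexicographic order, which matches chronological order only when the numbers are zero-padded. A small sketch (the third file name below is hypothetical):

```java
import java.util.Arrays;

public class Main {
  public static void main(String[] args) {
    // Zero-padded prefixes keep lexicographic order equal
    // to chronological order:
    String[] files = {
      "010-add-index.xml",
      "000-initial-schema.xml",
      "002-add-user-address.xml",
    };
    Arrays.sort(files);
    System.out.println(Arrays.toString(files));
    // Without padding, a file named "10-..." would sort
    // before "2-...", breaking the chronological order.
  }
}
```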

That's it. We're ready to run mvn liquibase:update -Pproduction and our production database will be updated---a new column will be added to the user table.

Also, see how MySQL Maven Plugin can help you to automate integration testing of database-connected classes.


How to Read MANIFEST.MF Files


Every Java package (JAR, WAR, EAR, etc.) has a MANIFEST.MF file in the META-INF directory. The file contains a list of attributes, which describe this particular package. For example:

Manifest-Version: 1.0
Created-By: 1.7.0_06 (Oracle Corporation)
Main-Class: MyPackage.MyClass

When your application has multiple JAR dependencies, you have multiple MANIFEST.MF files in your classpath. All of them have the same location: META-INF/MANIFEST.MF. Very often it is necessary to go through all of them at runtime and find an attribute by its name.

jcabi-manifests makes it possible with a one-liner:

import com.jcabi.manifests.Manifests;
String created = Manifests.read("Created-By");

Let's see why you would want to read attributes from manifest files, and how it works on a low level.
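
At the low level, the trick boils down to scanning the classpath for all manifests. A minimal plain-JDK sketch (not jcabi-manifests' actual implementation) could look like this:

```java
import java.io.InputStream;
import java.net.URL;
import java.util.Enumeration;
import java.util.jar.Manifest;

public class Main {
  public static void main(String[] args) throws Exception {
    // Every JAR in the classpath exposes its manifest at the
    // same resource path; enumerate them all and pick the first
    // occurrence of the attribute we are looking for.
    Enumeration<URL> urls = Thread.currentThread()
      .getContextClassLoader()
      .getResources("META-INF/MANIFEST.MF");
    while (urls.hasMoreElements()) {
      try (InputStream input = urls.nextElement().openStream()) {
        String value = new Manifest(input)
          .getMainAttributes()
          .getValue("Created-By");
        if (value != null) {
          System.out.println(value);
          break;
        }
      }
    }
  }
}
```

jcabi-manifests does essentially this once and caches the merged attributes, which is why Manifests.read() can stay a one-liner.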

Package Versioning

When you package a library or even a web application, it is a good practice to add attributes with the package version and build number to its MANIFEST.MF. In Maven, maven-jar-plugin can help you (the configuration for maven-war-plugin is almost the same):

<plugin>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifestEntries>
        <Foo-Version>${project.version}</Foo-Version>
        <Foo-Hash>${buildNumber}</Foo-Hash>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>

buildnumber-maven-plugin will help you to get ${buildNumber} from Git, SVN or Mercurial:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>buildnumber-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>create</goal>
      </goals>
    </execution>
  </executions>
</plugin>

After all these manipulations, the MANIFEST.MF in your JAR will contain these two extra lines (on top of all the others added there by Maven by default):

Foo-Version: 1.0-SNAPSHOT
Foo-Hash: 7ef4ac3

At runtime, you can show these values to users to help them understand which version of the product they are working with at any given moment.

Look at stateful.co, for example. At the bottom of its front page, you see the version number and Git hash. They are retrieved from the MANIFEST.MF of the deployed WAR package on every page load.

Credentials

Although this may be considered a bad practice (see Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley), sometimes it is convenient to package production credentials right into the JAR/WAR archive during the continuous integration/delivery cycle.

For example, you can encode your PostgreSQL connection details right into MANIFEST.MF:

<plugin>
  <artifactId>maven-war-plugin</artifactId>
  <configuration>
    <archive>
      <manifestEntries>
        <Pgsql>jdbc:postgresql://${pg.host}:${pg.port}/${pg.db}</Pgsql>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>

Afterwards, you can retrieve them in runtime using jcabi-manifests:

String url = Manifests.read("Pgsql");

If you know of any other useful purposes for MANIFEST.MF, let me know :)


Custom Pygments Lexer in Jekyll


I needed to create a custom syntax highlighting for requs.org on which I'm using Jekyll for site rendering.

This is how my code blocks look in markdown pages:

{% highlight requs %}
User is a "human being."
{% endhighlight %}

I created a custom Pygments lexer:

from pygments.lexer import RegexLexer
from pygments.token import Punctuation, Text, Keyword, Name, String
from pygments.util import shebang_matches
class RequsLexer(RegexLexer):
  name = 'requs'
  aliases = ['requs']
  tokens = {
    'root': [
      (r'"[^"]+"', String),
      (r'""".+"""', Text),
      (r'\b(needs|includes|requires|when|fail|is|a|the)\s*\b', Keyword),
      (r'([A-Z][a-z]+)+', Name),
      (r'[,;:]', Punctuation),
    ],
  }
  def analyse_text(text):
    return shebang_matches(text, r'requs')

Then, I packaged it for easy_install and installed locally:

$ easy_install src/requs_pygment
Processing requs_pygment
Running setup.py -q bdist_egg --dist-dir ...
zip_safe flag not set; analyzing archive contents...
Adding requs-pygment 0.1 to easy-install.pth file
Installed /Library/Python/2.7/site-packages/requs_pygment-0.1-py2.7.egg
Processing dependencies for requs-pygment==0.1
Finished processing dependencies for requs-pygment==0.1

It's done. Now I run jekyll build and my syntax is highlighted according to the custom rules I specified in the lexer.


SASS in Java Webapp


SASS is a powerful and very popular language for writing CSS style sheets. This is how I'm using SASS in my Maven projects.

First, I change the extensions of .css files to .scss and move them from src/main/webapp/css to src/main/scss.

Then, I configure the sass-maven-plugin (get its latest versions in Maven Central):

<plugin>
  <groupId>nl.geodienstencentrum.maven</groupId>
  <artifactId>sass-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>generate-css</id>
      <phase>generate-resources</phase>
      <goals>
        <goal>update-stylesheets</goal>
      </goals>
      <configuration>
        <sassSourceDirectory>${basedir}/src/main/scss</sassSourceDirectory>
        <destination>${project.build.directory}/css</destination>
      </configuration>
    </execution>
  </executions>
</plugin>

The SASS compiler will compile .scss files from src/main/scss and place .css files into target/css.

Then, I configure the minify-maven-plugin to compress/minify the style sheets produced by the SASS compiler:

<plugin>
  <groupId>com.samaxes.maven</groupId>
  <artifactId>minify-maven-plugin</artifactId>
  <configuration>
    <charset>UTF-8</charset>
    <nosuffix>true</nosuffix>
    <webappTargetDir>${project.build.directory}/css-min</webappTargetDir>
  </configuration>
  <executions>
    <execution>
      <id>minify-css</id>
      <goals>
        <goal>minify</goal>
      </goals>
      <configuration>
        <webappSourceDir>${project.build.directory}</webappSourceDir>
        <cssSourceDir>css</cssSourceDir>
        <cssSourceIncludes>
          <include>*.css</include>
        </cssSourceIncludes>
        <skipMerge>true</skipMerge>
      </configuration>
    </execution>
  </executions>
</plugin>

Minified .css files will be placed into target/css-min.

The final step is to configure the maven-war-plugin to pick up .css files and package them into the final WAR archive:

<plugin>
  <artifactId>maven-war-plugin</artifactId>
  <configuration>
    [..other configuration options..]
    <webResources combine.children="append">
      <resource>
        <directory>${project.build.directory}/css-min</directory>
      </resource>
    </webResources>
  </configuration>
</plugin>

That's it.


XML+XSLT in a Browser


Separating data from presentation is a great concept. Take HTML and CSS, for example. HTML is supposed to hold pure data, and CSS is supposed to format that data to make it readable by a human. Years ago, that was probably the intention of HTML/CSS, but in reality it doesn't work like that, mostly because CSS is not powerful enough.

We still have to format our data using HTML tags, while CSS can help slightly with positioning and decorating.

On the other hand, XML with XSLT implements perfectly the idea of separating data and presentation. XML documents, like HTML, are supposed to contain data only without any information about positioning or formatting. XSL stylesheets position and decorate the data. XSL is a much more powerful language. That's why it's possible to avoid any formatting inside XML.

The latest versions of Chrome, Safari, Firefox and IE all support this mechanism. When a browser retrieves an XML document from a server, and the document has an XSL stylesheet associated with it---the browser transforms the XML into HTML on the fly.

Working Example

Let's review a simple Java web application that works this way. It uses the Takes Framework, which makes this mechanism possible. In the next post, I'll explain how ReXSL works. For now, though, let's focus on the idea of delivering bare data in XML and formatting it with an XSL stylesheet.

Open http://www.stateful.co---it is a collection of stateful web primitives, explained in the Atomic Counters at Stateful.co article.

Open it in Chrome or Safari. When you do, you should see a normal web page with a logo, some text, some links, a footer, etc. Now check its sources (I assume you know how to do this).

This is approximately what you will see (I assume you understand XML, if not, start learning it immediately):

<?xml-stylesheet type='text/xsl' href='/xsl/index.xsl'?>
<page date="2014-06-15T15:30:49.521Z" ip="10.168.29.135">
  <menu>home</menu>
  <documentation>.. some text here ..</documentation>
  <version>
    <name>1.4</name>
    <revision>5c7b5af</revision>
    <date>2014-05-29 07:58</date>
  </version>
  <links>
    <link href="..." rel="rexsl:google" type="text/xml"/>
    <link href="..." rel="rexsl:github" type="text/xml"/>
    <link href="..." rel="rexsl:facebook" type="text/xml"/>
  </links>
  <millis>70</millis>
</page>

As you can see, it is a proper XML document with attributes, elements and data. It contains absolutely no information about how its elements should be presented to an end-user. Actually, this document is more suitable for machine parsing than for reading by a human.

The document contains data that is important for its requester. It's up to the requester how to render the data, or whether to render it at all.

Its second line associates the document with the XSL stylesheet /xsl/index.xsl that is loaded by the browser separately:

<?xml-stylesheet type='text/xsl' href='/xsl/index.xsl'?>

Open developer tools in Chrome and you will see that right after the page is loaded, the browser loads the XSL stylesheet and then all other resources, including a few CSS stylesheets, jQuery and an SVG logo.

index.xsl includes layout.xsl, that's why it is loaded right after.

Let's consider a simplified example of index.xsl (in reality it is much more complex; check layout.xsl):

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml">
  <xsl:template match="page">
    <html>
      <body>
        <p>
          Current version of the application is
          <xsl:value-of select="version/name"/>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

I think it's obvious how the HTML page will look after applying this XSL stylesheet to our XML document.

For me, this XSL looks clean and easy to understand. However, I often hear people say that XSLT is a hard-to-understand programming language. I don't find it hard to understand at all. Of course, I'm not using all of its features. But, for simple page rendering, all I need to know are a few simple commands and the principle of XML transformation.

Why Not a Templating Engine?

Now, why is this approach better than all those widely used Java templating engines, such as JSP, JSF, Velocity, FreeMarker, Tiles, etc.?

Well, I see a number of reasons. But, the most important are:

  1. Web UI and API are the same pages. There is no need to develop separate pages for a RESTful API---the Web user interface, when accessed by a computer, is the API. In my experience, this leads to massive avoidance of code duplication.

  2. XSL is testable by itself without a server. In order to test how our web site will look with certain data, we just create a new XML document with necessary test data, associate it with an XSL and open it in a browser. We can also modify XML and refresh the page in browser. This makes the work of HTML/CSS designer much easier and independent of programmers.

  3. XSL is a powerful functional language. Compared with all other templating engines, which look mostly like workarounds, XSL is a complete and well-designed environment. Writing XSL (after you get used to its syntax and programming concepts) is a pleasure in itself. You're not injecting instructions into a HTML document (like in JSP and all others). Instead, you are programming transformation of data into presentation---a different mindset and much better feeling.

  4. XML output is perfectly testable. A controller in MVC that generates an XML document with all the data required by the XSL stylesheet can easily be tested in a single unit test using simple XPath expressions. Testing a controller that injects data into a templating engine is a much more complex operation---sometimes even impossible. I also write in PHP and Ruby; they have exactly the same problems, even though their templating engines are much more powerful due to the interpreted nature of those languages.

Is It Fully Supported?

Everything would be great if all browsers supported XML+XSL rendering. However, this is far from true. Only the latest versions of modern browsers support XSL. Check this comparison done by Julian Reschke. Besides that, XSLT 2.0 is not supported at all.

There is a workaround, though. We can detect which browser is making the request (via its User-Agent HTTP header) and transform XML into HTML on the server side. Thus, for modern browsers that support XSL we deliver XML, and for all others---HTML.

This is exactly how ReXSL framework works. Open http://www.stateful.co in Internet Explorer and you will see an HTML document, not an XML document as is the case with Chrome.
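
The server-side half of this trick can be sketched with the TrAX API that ships with the JDK (a minimal illustration with inline XML and XSL; not the actual ReXSL code):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Main {
  public static void main(String[] args) throws Exception {
    String xml = "<page><version><name>1.4</name></version></page>";
    String xsl = "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"page\">"
      + "version <xsl:value-of select=\"version/name\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";
    // The JDK bundles an XSLT 1.0 processor behind the TrAX API
    Transformer transformer = TransformerFactory.newInstance()
      .newTransformer(new StreamSource(new StringReader(xsl)));
    StringWriter out = new StringWriter();
    transformer.transform(
      new StreamSource(new StringReader(xml)),
      new StreamResult(out)
    );
    System.out.println(out); // prints "version 1.4"
  }
}
```

Note that the bundled processor only supports XSLT 1.0; for XSLT 2.0 stylesheets a separate processor such as Saxon is needed.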

BTW, see how all this is implemented: XML Data and XSL Views in Takes Framework.

Read this one, it continues the discussion of this subject: RESTful API and a Web Site in the Same URL


Deploy Jekyll to GitHub Pages


This blog is written in Jekyll and is hosted at GitHub Pages. It uses half a dozen custom plugins, which are not allowed there.

Here is how I deploy it:

$ jgd

That's it. jgd is my Ruby gem (stands for "Jekyll GitHub Deploy"), which does the trick. Here is what it does:

  1. It clones your existing repository from the current directory to a temporary one (guessing the URL of the repo from .git/config file).

  2. Runs jekyll build in that temporary directory, which saves the output in another temporary directory.

  3. Checks out gh-pages branch or creates one if it doesn't exist.

  4. Copies the content of the site built by jekyll build into the branch, thus overwriting existing files, commits and pushes to GitHub.

  5. Cleans up all temporary directories.

Using this gem is very easy. Just install it with gem install jgd and then run it in the root directory of your Jekyll blog.

What is important is that your Jekyll site files must be located in the root directory of the repository, just as they are on this blog; see its sources on GitHub.

You can easily integrate jgd with Travis. See .travis.yml of this blog.

Full documentation about the gem is located here.


CasperJS Tests in Maven Build


I'm a big fan of automated testing in general and integration testing in particular. I strongly believe that effort spent on writing tests is a direct investment into the quality and stability of the product under development.

CasperJS is a testing framework on top of PhantomJS, which is a headless browser. Using CasperJS, we can ensure that our application responds correctly to requests sent by a regular web browser.

This is a sample CasperJS test, which makes an HTTP request to a home page of a running WAR application and asserts that the response has 200 HTTP status code:

casper.test.begin(
  'home page can be rendered',
  function (test) {
    casper.start(
      casper.cli.get('home'), // URL of home page
      function () {
        test.assertHttpStatus(200);
      }
    );
    casper.run(
      function () {
        test.done();
      }
    );
  }
);

I keep this test in the src/test/casperjs/home-page.js file. Let's see how CasperJS can be executed automatically on every Maven build.

Here is the test scenario, implemented with a combination of Maven plugins:

  1. Install PhantomJS

  2. Install CasperJS

  3. Reserve a random TCP port

  4. Start Tomcat on that TCP port (with WAR inside)

  5. Run CasperJS tests and point them to the running Tomcat

  6. Shutdown Tomcat

I'm using a combination of plugins. Let's go through the steps one by one.

BTW, I'm not showing plugin versions in the examples below, primarily because most of them are in active development. Check their versions at Maven Central (yes, all of them are available there).

1. Install PhantomJS

First of all, we have to download the PhantomJS executable, which is a platform-specific binary. Thanks to Kyle Lieber, we have an off-the-shelf Maven plugin, phantomjs-maven-plugin, that detects the current platform and downloads the appropriate binary automatically, placing it into the target directory.

<plugin>
  <groupId>com.github.klieber</groupId>
  <artifactId>phantomjs-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>install</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <version>1.9.2</version>
  </configuration>
</plugin>

The exact name of the downloaded binary is stored in the ${phantomjs.binary} Maven property.

2. Install CasperJS

Unfortunately, there is no similar plugin for the CasperJS installation (at least I haven't found any as of yet). That's why I'm using plain old git (you should have it installed on your build machine).

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>casperjs-install</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>git</executable>
        <arguments>
          <argument>clone</argument>
          <argument>--depth=1</argument>
          <argument>https://github.com/n1k0/casperjs.git</argument>
          <argument>${project.build.directory}/casperjs</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>

3. Reserve TCP Port

I need to obtain a random TCP port where Tomcat will be started. The port has to be available on the build machine. I want to be able to run multiple Maven builds in parallel, so that's why I get a random port on every build.

In other examples, you may see people using fixed port numbers, like 5555 or something similar. This is a very bad practice. Always reserve a new random port when you need it.

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>tomcat-port</id>
      <goals>
        <goal>reserve-network-port</goal>
      </goals>
      <configuration>
        <portNames>
          <portName>tomcat.port</portName>
        </portNames>
      </configuration>
    </execution>
  </executions>
</plugin>

The plugin reserves a port and sets its value into the ${tomcat.port} Maven property.
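
Under the hood, reserving a free port is essentially this plain-Java trick (a sketch, not the plugin's actual code): binding to port 0 asks the OS for any free ephemeral port.

```java
import java.net.ServerSocket;

public class Main {
  public static void main(String[] args) throws Exception {
    // Port 0 means "any free port"; we read its number
    // and release the socket immediately.
    int port;
    try (ServerSocket socket = new ServerSocket(0)) {
      port = socket.getLocalPort();
    }
    System.out.println("reserved port: " + port);
  }
}
```

There is a tiny race window between releasing the socket and Tomcat binding to the same port, but in practice it is negligible, and it is still far safer than hard-coding a port number.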

4. Start Tomcat

Now, it's time to start Tomcat with the WAR package inside. I'm using tomcat7-maven-plugin that starts a real Tomcat7 server and configures it to serve on the port reserved above.

<plugin>
  <groupId>org.apache.tomcat.maven</groupId>
  <artifactId>tomcat7-maven-plugin</artifactId>
  <configuration>
    <path>/</path>
  </configuration>
  <executions>
    <execution>
      <id>start-tomcat</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>run-war-only</goal>
      </goals>
      <configuration>
        <port>${tomcat.port}</port>
        <fork>true</fork>
      </configuration>
    </execution>
  </executions>
</plugin>

Because the fork option is set to true, Tomcat7 continues to run after the plugin execution finishes. That's exactly what I need.

5. Run CasperJS

Now, it's time to run CasperJS. Even though some plugins exist for this, I'm using the plain old exec-maven-plugin, mostly because it is more configurable.

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>casperjs-test</id>
      <phase>integration-test</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>
          ${project.build.directory}/casperjs/bin/casperjs
        </executable>
        <workingDirectory>${basedir}</workingDirectory>
        <arguments>
          <argument>test</argument>
          <argument>--verbose</argument>
          <argument>--no-colors</argument>
          <argument>--concise</argument>
          <argument>--home=http://localhost:${tomcat.port}</argument>
          <argument>${basedir}/src/test/casperjs</argument>
        </arguments>
        <environmentVariables>
          <PHANTOMJS_EXECUTABLE>${phantomjs.binary}</PHANTOMJS_EXECUTABLE>
        </environmentVariables>
      </configuration>
    </execution>
  </executions>
</plugin>

The environment variable PHANTOMJS_EXECUTABLE is the undocumented feature that makes this whole scenario possible. It configures the location of the PhantomJS executable, which was downloaded a few steps above.

6. Shutdown Tomcat

In the last step, I shut down the Tomcat server.

<plugin>
  <groupId>org.apache.tomcat.maven</groupId>
  <artifactId>tomcat7-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>stop-tomcat</id>
      <phase>post-integration-test</phase>
      <goals>
        <goal>shutdown</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Real Example

If you want to see how this all works in action, take a look at stateful.co. It is a Java Web application hosted at CloudBees. Its source code is open and available in GitHub.

Its pom.xml contains exactly the same configurations explained above, but joined together.

If you have any questions, please don't hesitate to ask below.

PS. Also, check this: PhantomJS as an HTML Validator


Limit Java Method Execution Time


Say, you want to allow a Java method to work for a maximum of five seconds and want an exception to be thrown if the timeframe is exceeded. Here is how you can do it with jcabi-aspects and AspectJ:

public class Resource {
  @Timeable(limit = 5, unit = TimeUnit.SECONDS)
  public String load(URL url) throws IOException {
    return url.openConnection().getContent().toString();
  }
}

Keep in mind that you should weave your classes after compilation, as explained here.

Let's discuss how this actually works, but first, I recommend you read this post, which explains how AOP aspects work together with Java annotations.

Thanks to the @Timeable annotation and class weaving, every call to the load() method is intercepted by an aspect from jcabi-aspects. The aspect starts a new thread that monitors the execution of the method, checking every second whether it is still running.

If the method runs for over five seconds, the thread calls interrupt() on the method's thread.

Despite the very common expectation that a thread should be terminated immediately by that call, that is not what happens. This article explains the mechanism in more detail. Let's discuss it briefly:

  1. interrupt() sets a marker in a thread;

  2. The thread checks interrupted() as often as it can;

  3. If the marker is set, the thread stops and throws InterruptedException

This method will not react to an interrupt() call and will run until the JVM is killed (very bad design):

public void work() {
  while (true) {
    // do something
  }
}

This is how we should refactor it in order to make it sensitive to interruption requests:

public void work() throws InterruptedException {
  while (true) {
    if (Thread.interrupted()) {
      throw new InterruptedException();
    }
    // do something
  }
}

In other words, your method can only stop itself. Nothing else can do it. The thread it is running in can't be terminated by another thread. The best thing that the other thread can do is to send your thread a "message" (through interrupt() method) that it's time to stop. If your thread ignores the message, nobody can do anything.

Most I/O operations in JDK are designed this way. They check the interruption status of their threads while waiting for I/O resources.
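Thread.sleep() is a convenient stand-in to see this in action: like blocking I/O calls, it stays responsive to interrupts and throws InterruptedException almost immediately when interrupted (a small demo, nothing more):

```java
public class SleepInterrupt {
  public static void main(String[] args) throws Exception {
    Thread sleeper = new Thread(() -> {
      try {
        Thread.sleep(60000L); // blocks, but keeps watching the interrupt flag
      } catch (InterruptedException ex) {
        // woken up long before the minute elapsed
      }
    });
    sleeper.start();
    sleeper.interrupt(); // the sleeping thread reacts right away
    sleeper.join(1000L);
  }
}
```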

Thus, use the @Timeable annotation, but keep in mind that there may be situations when a thread can't be interrupted.

© Yegor Bugayenko 2014–2018

Avoid String Concatenation


This is "string concatenation," and it is a bad practice:

// bad practice, don't reuse!
String text = "Hello, " + name + "!";

Why? Some may say that it is slow, mostly because parts of the resulting string are copied multiple times. Indeed, on every + operator, the String class allocates a new block of memory and copies into it everything it has, plus the suffix being concatenated. This is true, but it is not the point here.

Actually, I don't think performance in this case is a big issue. Moreover, there were multiple experiments showing that concatenation is not that slow when compared to other string building methods and sometimes is even faster.

Some say that concatenated strings are not localizable because in different languages text blocks in a phrase may be positioned in a different order. The example above can't be translated to, say, Russian, where we would want to put a name in front of "привет." We will need to localize the entire block of code, instead of just translating a phrase.

However, my point here is different. I strongly recommend avoiding string concatenation because it is less readable than other methods of joining texts together.

Let's see these alternative methods. I'd recommend three of them (in order of preference): String.format(), Apache StringUtils and Guava Joiner.

There is also StringBuilder, but I don't find it as attractive as StringUtils. It is a useful builder of strings, but not a proper replacement for string concatenation when readability is important.
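To be fair, StringBuilder is the right tool when a string is assembled incrementally, for example in a loop, where neither String.format() nor a joiner fits naturally. A minimal sketch (the class and method names are made up):

```java
public class CsvLine {
  public static String build(int[] values) {
    StringBuilder line = new StringBuilder();
    for (int idx = 0; idx < values.length; ++idx) {
      if (idx > 0) {
        line.append(','); // separator between values only
      }
      line.append(values[idx]);
    }
    return line.toString();
  }
  public static void main(String[] args) {
    System.out.println(build(new int[] {1, 2, 3})); // 1,2,3
  }
}
```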

String.format()

String.format() is my favorite option. It makes text phrases easy to understand and modify. It is a static utility method that mirrors sprintf() from C. It allows you to build a string using a pattern and substitutors:

String text = String.format("Hello, %s!", name);

When the text is longer, the advantages of the formatter become much more obvious. Look at this ugly code:

String msg = "Dear " + customer.name()
  + ", your order #" + order.number()
  + " has been shipped at " + shipment.date()
  + "!";

This one looks much more beautiful, doesn't it?

String msg = String.format(
  "Dear %1$s, your order #%2$d has been shipped at %3$tR!",
  customer.name(), order.number(), shipment.date()
);

Please note that I'm using argument indexes in order to make the pattern even more localizable. Let's say I want to translate it to Greek. This is how it will look:

Αγαπητέ %1$s, στις %3$tR στείλαμε την παραγγελία σου με αριθμό #%2$d!

I'm changing the order of substitutions in the pattern, but not in the actual list of method arguments.
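This behavior is easy to check with a small demo; the same argument list works with patterns that substitute in different orders (the phrases here are made up for illustration):

```java
public class Positional {
  public static void main(String[] args) {
    // the argument list stays the same for both patterns
    String english = String.format("Dear %1$s, your order #%2$d!", "Alice", 42);
    String reordered = String.format("Order #%2$d is yours, %1$s!", "Alice", 42);
    System.out.println(english);   // Dear Alice, your order #42!
    System.out.println(reordered); // Order #42 is yours, Alice!
  }
}
```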

Apache StringUtils.join()

When the text is rather long (longer than your screen width), I would recommend that you use the utility class StringUtils from Apache commons-lang3:

import org.apache.commons.lang3.StringUtils;
String xml = StringUtils.join(
  "<?xml version='1.0'?>",
  "<html><body>",
  "<p>This is a test XHTML document,",
  " which would look ugly,",
  " if we would use a single line,",
  " or string concatenation or String.format().</p>",
  "</body></html>"
);

The need to include an additional JAR dependency in your classpath may be considered a downside of this method (get its latest version from Maven Central):

<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
</dependency>

Guava Joiner

Similar functionality is provided by Joiner from Google Guava:

import com.google.common.base.Joiner;
String text = Joiner.on("").join(
  "WE HAVE BUNNY.\n",
  "GATHER ONE MILLION DOLLARS IN UNMARKED ",
  "NON-CONSECUTIVE TWENTIES.\n",
  "AWAIT INSTRUCTIONS.\n",
  "NO FUNNY STUFF"
);

It is a bit less convenient than StringUtils, since you always have to provide a joiner (a character or string placed between text blocks).

Again, a dependency is required in this case:

<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
</dependency>

Yes, in most cases, all of these methods work slower than plain simple concatenation. However, I strongly believe that computers are cheaper than people. What I mean is that the time spent by programmers understanding and modifying ugly code is much more expensive than the cost of an additional server that will make beautifully written code work faster.

If you know any other methods of avoiding string concatenation, please comment below.


Objects Should Be Immutable


In object-oriented programming, an object is immutable if its state can't be modified after it is created. In Java, a good example of an immutable object is String. Once created, we can't modify its state. We can request that it creates new strings, but its own state will never change.

However, there are not so many immutable classes in JDK. Take, for example, class Date. It is possible to modify its state using setTime().

I don't know why the JDK designers decided to design these two very similar classes so differently. However, I believe that the design of a mutable Date has many flaws, while the immutable String is much more in the spirit of the object-oriented paradigm.

Moreover, I think that all classes should be immutable in a perfect object-oriented world. Unfortunately, sometimes, it is technically not possible due to limitations in JVM. Nevertheless, we should always aim for the best.

This is an incomplete list of arguments in favor of immutability:

  • immutable objects are simpler to construct, test, and use
  • truly immutable objects are always thread-safe
  • they help to avoid temporal coupling
  • their usage is side-effect free (no defensive copies)
  • identity mutability problem is avoided
  • they always have failure atomicity
  • they are much easier to cache
  • they prevent NULL references, which are bad
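For reference, this is what a fully immutable class can look like in plain Java (a hypothetical Point, not taken from any library): the class is final, all fields are final, and "modification" means constructing a new object:

```java
public final class Point {
  private final int x;
  private final int y;
  public Point(int x, int y) {
    this.x = x;
    this.y = y;
  }
  // "modifiers" return new objects instead of touching this one
  public Point movedTo(int newX, int newY) {
    return new Point(newX, newY);
  }
  public int x() {
    return this.x;
  }
  public int y() {
    return this.y;
  }
}
```

Because no method ever writes to x or y after the constructor finishes, every argument in the list above applies to such a class automatically.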

Let's discuss the most important arguments one by one.

Thread Safety

The first and the most obvious argument is that immutable objects are thread-safe. This means that multiple threads can access the same object at the same time, without clashing with another thread.


If no method of an object can modify its state, then no matter how many of them are being called, how often, and in how many parallel threads---they will each work in their own memory space on the stack.

Goetz et al. explained the advantages of immutable objects in more detail in their famous book Java Concurrency in Practice (highly recommended).

Avoiding Temporal Coupling

Here is an example of temporal coupling (the code makes two consecutive HTTP POST requests, where the second one contains HTTP body):

Request request = new Request("http://localhost");
request.method("POST");
String first = request.fetch();
request.body("text=hello");
String second = request.fetch();

This code works. However, you must remember that the first request should be configured before the second one may happen. If we decide to remove the first request from the script, we will remove the second and the third line, and won't get any errors from the compiler:

Request request = new Request("http://localhost");
// request.method("POST");
// String first = request.fetch();
request.body("text=hello");
String second = request.fetch();

Now, the script is broken although it compiled without errors. This is what temporal coupling is about---there is always some hidden information in the code that a programmer has to remember. In this example, we have to remember that the configuration for the first request is also used for the second one.

We have to remember that the second request should always stay together with the first one and be executed after it.

If the Request class were immutable, the first snippet wouldn't have worked in the first place, and would have been rewritten like this:

final Request request = new Request("");
String first = request.method("POST").fetch();
String second = request.method("POST").body("text=hello").fetch();

Now, these two requests are not coupled. We can safely remove the first one, and the second one will still work correctly. You may point out that there is code duplication. Yes, we should get rid of it and rewrite the code:

final Request request = new Request("");
final Request post = request.method("POST");
String first = post.fetch();
String second = post.body("text=hello").fetch();

See, refactoring didn't break anything and we still don't have temporal coupling. The first request can be removed safely from the code without affecting the second one.

I hope this example demonstrates that the code manipulating immutable objects is more readable and maintainable, because it doesn't have temporal coupling.

Avoiding Side Effects

Let's try to use our Request class in a new method (now it is mutable):

public String post(Request request) {
  request.method("POST");
  return request.fetch();
}

Let's try to make two requests---the first with GET method and the second with POST:

Request request = new Request("http://localhost");
request.method("GET");
String first = this.post(request);
String second = request.fetch();

Method post() has a "side effect"---it makes changes to the mutable object request. These changes are not really expected in this case. We expect it to make a POST request and return its body. We don't want to read its documentation just to find out that behind the scene it also modifies the request we're passing to it as an argument.

Needless to say, such side effects lead to bugs and maintainability issues. It would be much better to work with an immutable Request:

public String post(Request request) {
  return request.method("POST").fetch();
}

In this case, we don't have any side effects. Nobody can modify our request object, no matter where it is used and how deep through the call stack it is passed by method calls:

Request request = new Request("http://localhost").method("GET");
String first = this.post(request);
String second = request.fetch();

This code is perfectly safe and side effect free.

Avoiding Identity Mutability

Very often, we want objects to be equal to each other if their internal states are the same. The Date class is a good example:

Date first = new Date(1L);
Date second = new Date(1L);
assert first.equals(second); // true

These are two different objects; however, they are equal to each other because their encapsulated states are the same. This is made possible through their custom overridden implementations of the equals() and hashCode() methods.
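Such identity-by-state takes only a few lines to implement (a hypothetical Money class, just for illustration):

```java
public final class Money {
  private final long cents;
  public Money(long cents) {
    this.cents = cents;
  }
  @Override
  public boolean equals(Object other) {
    // two objects are equal when their encapsulated states are equal
    return other instanceof Money && ((Money) other).cents == this.cents;
  }
  @Override
  public int hashCode() {
    return Long.hashCode(this.cents);
  }
}
```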

The consequence of this convenient approach, when used with mutable objects, is that every time we modify an object's state, we change its identity:

Date first = new Date(1L);
Date second = new Date(1L);
first.setTime(2L);
assert first.equals(second); // false

This may look natural, until you start using your mutable objects as keys in maps:

Map<Date, String> map = new HashMap<>();
Date date = new Date();
map.put(date, "hello, world!");
date.setTime(12345L);
assert map.containsKey(date); // false

When modifying the state of date object, we're not expecting it to change its identity. We're not expecting to lose an entry in the map just because the state of its key is changed. However, this is exactly what is happening in the example above.

When we add an object to the map, its hashCode() returns one value. This value is used by HashMap to place the entry into the internal hash table. When we call containsKey(), the hash code of the object is different (because it is based on its internal state) and HashMap can't find it in the internal hash table.

This is a very annoying and difficult-to-debug side effect of mutable objects. Immutable objects avoid it completely.

Failure Atomicity

Here is a simple example:

public class Stack {
  private int size;
  private String[] items;
  public void push(String item) {
    size++;
    if (size > items.length) {
      throw new RuntimeException("stack overflow");
    }
    items[size - 1] = item;
  }
}

It is obvious that an object of class Stack will be left in a broken state if it throws a runtime exception on overflow. Its size property will be incremented, while items won't get a new element.


Immutability prevents this problem. An object will never be left in a broken state because its state is modified only in its constructor. The constructor will either fail, rejecting object instantiation, or succeed, making a valid solid object, which never changes its encapsulated state.
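A hypothetical immutable counterpart of the Stack above makes the point concrete: push() either returns a complete new stack or throws, and the original object is intact either way (the 1000-element limit is an arbitrary assumption for the sketch):

```java
import java.util.Arrays;

public final class ImmutableStack {
  private final String[] items;
  public ImmutableStack() {
    this(new String[0]);
  }
  private ImmutableStack(String[] items) {
    this.items = items;
  }
  public ImmutableStack push(String item) {
    if (this.items.length >= 1000) {
      // even if we fail here, "this" is untouched
      throw new IllegalStateException("stack overflow");
    }
    String[] copy = Arrays.copyOf(this.items, this.items.length + 1);
    copy[this.items.length] = item;
    return new ImmutableStack(copy);
  }
  public int size() {
    return this.items.length;
  }
}
```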

For more on this subject, read Effective Java, 2nd Edition by Joshua Bloch.

Arguments Against Immutability

There are a number of arguments against immutability.

  1. “Immutability is not for enterprise systems”. Very often, I hear people say that immutability is a fancy feature, while absolutely impractical in real enterprise systems. As a counter-argument, I can only show some examples of real-life applications that contain only immutable Java objects: jcabi-http, jcabi-xml, jcabi-github, jcabi-s3, jcabi-dynamo, jcabi-w3c, jcabi-jdbc, jcabi-simpledb, jcabi-ssh. The above are all Java libraries that work solely with immutable classes/objects. netbout.com and stateful.co are web applications that work solely with immutable objects.

  2. “It's cheaper to update an existing object than create a new one”. Oracle thinks that “The impact of object creation is often overestimated and can be offset by some of the efficiency associated with immutable objects. These include decreased overhead due to garbage collection, and the elimination of code needed to protect mutable objects from corruption.” I agree.

If you have some other arguments, please post them below and I'll try to comment.

P.S. Check takes.org, a Java web framework that consists entirely of immutable objects.


If you like this article, you will definitely like these very relevant posts too:

Immutable Objects Are Not Dumb
Immutable objects are not the same as passive data structures without setters, despite a very common misbelief.

How an Immutable Object Can Have State and Behavior?
Object state and behavior are two very different things, and confusing the two often leads to incorrect design.

Gradients of Immutability
There are a few levels and forms of immutability in object-oriented programming, all of which can be used when they seem appropriate.


Java Method Logging with AOP and Annotations


Sometimes, I want to log (through slf4j and log4j) every execution of a method, seeing what arguments it receives, what it returns and how much time every execution takes. This is how I'm doing it, with the help of AspectJ, jcabi-aspects and Java annotations:

public class Foo {
  @Loggable
  public int power(int x, int p) {
    return (int) Math.pow(x, p);
  }
}

This is what I see in log4j output:

[INFO] com.example.Foo #power(2, 10): 1024 in 12μs
[INFO] com.example.Foo #power(3, 3): 27 in 4μs

Nice, isn't it? Now, let's see how it works.

Annotation with Runtime Retention

Annotations are a meta-programming technique introduced in Java 5. They don't change the way code works, but attach marks to certain elements (methods, classes or variables). In other words, annotations are just markers attached to the code that can be seen and read. Some annotations are designed to be seen at compile time only---they don't exist in .class files after compilation. Others remain visible after compilation and can be accessed at runtime.

For example, @Override is of the first type (its retention type is SOURCE), while @Test from JUnit is of the second type (retention type is RUNTIME). @Loggable---the one I'm using in the script above---is an annotation of the second type, from jcabi-aspects. It stays with the byte-code in the .class file after compilation.

Again, it is important to understand that even though method power() is annotated and compiled, it doesn't send anything to slf4j so far. It just contains a marker saying "please, log my execution."
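Such a runtime-retained marker could be declared like this (a sketch of the idea, not jcabi's actual source):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// RUNTIME retention means the marker survives into the .class file
// and is readable via reflection, which is what the aspect relies on
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Loggable {
}
```

Anyone, an AOP aspect included, can then discover the marker at runtime through reflection, for example with Method.isAnnotationPresent().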

Aspect Oriented Programming (AOP)

AOP is a useful technique that enables adding executable blocks to the source code without explicitly changing it. In our example, we don't want to log method execution inside the class. Instead, we want some other class to intercept every call to method power(), measure its execution time and send this information to slf4j.

We want that interceptor to understand our @Loggable annotation and log every call to that specific method power(). And, of course, the same interceptor should be used for other methods where we'll place the same annotation in the future.

This case perfectly fits the original intent of AOP---to avoid re-implementation of some common behavior in multiple classes.

Logging is a supplementary feature to our main functionality, and we don't want to pollute our code with multiple logging instructions. Instead, we want logging to happen behind the scenes.

In terms of AOP, our solution can be explained as creating an aspect that cross-cuts the code at certain join points and applies an around advice that implements the desired functionality.

AspectJ

Let's see what these magic words mean. But, first, let's see how jcabi-aspects implements them using AspectJ (it's a simplified example, full code you can find in MethodLogger.java):

@Aspect
public class MethodLogger {
  @Around("execution(* *(..)) && @annotation(Loggable)")
  public Object around(ProceedingJoinPoint point) throws Throwable {
    long start = System.currentTimeMillis();
    Object result = point.proceed();
    Logger.info(
      "#%s(%s): %s in %[msec]s",
      MethodSignature.class.cast(point.getSignature()).getMethod().getName(),
      point.getArgs(),
      result,
      System.currentTimeMillis() - start
    );
    return result;
  }
}

This is an aspect with a single around advice around() inside. The aspect is annotated with @Aspect and advice is annotated with @Around. As discussed above, these annotations are just markers in .class files. They don't do anything except provide some meta-information to those who are interested in runtime.

Annotation @Around has one parameter, which---in this case---says that the advice should be applied to a method if:

  1. its visibility modifier is * (public, protected or private);

  2. its name is * (any name);

  3. its arguments are .. (any arguments); and

  4. it is annotated with @Loggable

When a call to method power() is intercepted, method around() executes before the actual method does. It receives an instance of class ProceedingJoinPoint and must return an object, which will be used as the result of method power().

In order to call the original method, power(), the advice has to call proceed() of the join point object.

We compile this aspect and make it available in classpath together with our main file Foo.class. So far so good, but we need to take one last step in order to put our aspect into action---we should apply our advice.

Binary Aspect Weaving

Aspect weaving is the name of the advice applying process. Aspect weaver modifies original code by injecting calls to aspects. AspectJ does exactly that. We give it two binary Java classes Foo.class and MethodLogger.class; it gives back three---modified Foo.class, Foo$AjcClosure1.class and unmodified MethodLogger.class.

In order to understand which advice should be applied to which methods, the AspectJ weaver uses annotations from the .class files. It also uses reflection to browse all classes on the classpath, analyzing which methods satisfy the conditions from the @Around annotation. Of course, it finds our method power().

So, there are two steps. First, we compile our .java files using javac and get two files. Then, AspectJ weaves/modifies them and creates its own extra class. Our Foo class looks something like this after weaving:

public class Foo {
  private final MethodLogger logger;
  @Loggable
  public int power(int x, int p) {
    return this.logger.around(point);
  }
  private int power_aroundBody(int x, int p) {
    return Math.pow(x, p);
  }
}

AspectJ weaver moves our original functionality to a new method, power_aroundBody(), and redirects all power() calls to the aspect class MethodLogger.

Instead of one method power() in class Foo, we now have four classes working together. From now on, this is what happens behind the scenes on every call to power():

PlantUML SVG diagram

Original functionality of method power() is indicated by the small green lifeline on the diagram.

As you see, the aspect weaving process connects together classes and aspects, transferring calls between them through join points. Without weaving, both classes and aspects are just compiled Java binaries with attached annotations.

jcabi-aspects

jcabi-aspects is a JAR library that contains Loggable annotation and MethodLogger aspect (btw, there are many more aspects and annotations). You don't need to write your own aspect for method logging. Just add a few dependencies to your classpath and configure jcabi-maven-plugin for aspect weaving (get their latest versions in Maven Central):

<project>
  <dependencies>
    <dependency>
      <groupId>com.jcabi</groupId>
      <artifactId>jcabi-aspects</artifactId>
    </dependency>
    <dependency>
      <groupId>org.aspectj</groupId>
      <artifactId>aspectjrt</artifactId>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>com.jcabi</groupId>
        <artifactId>jcabi-maven-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>ajc</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Since this weaving procedure takes a lot of configuration effort, I created a convenient Maven plugin with an ajc goal, which does the entire aspect weaving job. You can use AspectJ directly, but I recommend that you use jcabi-maven-plugin.

That's it. Now you can use @com.jcabi.aspects.Loggable annotation and your methods will be logged through slf4j.

If something doesn't work as explained, don't hesitate to submit a GitHub issue.


Object-Oriented Java Adapter of Amazon S3 SDK



I'm a big fan of Amazon Web Services (AWS). I'm using them in almost all of my projects. One of their most popular services is Simple Storage Service (S3). It is a storage for binary objects (files) with unique names, accessible through HTTP or RESTful API.

Using S3 is very simple. You create a "bucket" with a unique name, upload your "object" into the bucket through their web interface or through RESTful API, and then download it again (either through HTTP or the API.)

Amazon ships the Java SDK that wraps their RESTful API. However, this SDK is not object-oriented at all. It is purely imperative and procedural---it just mirrors the API.

For example, in order to download an existing object doc.txt from bucket test-1, you have to do something like this:

AWSCredentials creds = new BasicAWSCredentials(key, secret);
AmazonS3 aws = new AmazonS3Client(creds);
S3Object obj = aws.getObject(
  new GetObjectRequest("test-1", "doc.txt")
);
InputStream input = obj.getObjectContent();
String content = IOUtils.toString(input, "UTF-8");
input.close();

As always, procedural programming has its inevitable disadvantages. To overcome them all, I designed jcabi-s3, which is a small object-oriented adapter for Amazon SDK. This is how the same object-reading task can be accomplished with jcabi-s3:

Region region = new Region.Simple(key, secret);
Bucket bucket = region.bucket("test-1");
Ocket ocket = bucket.ocket("doc.txt");
String content = new Ocket.Text(ocket).read();

Why is this approach better? Well, there are a number of obvious advantages.

S3 Object is an Object in Java

An S3 object gets its representative in Java. It is not a collection of procedures to be called in order to get its properties (as with the AWS SDK). Rather, it is a Java object with certain behavior. I call them "ockets" (similar to "buckets"), in order to avoid clashes with java.lang.Object.

Ocket is an interface that exposes the behavior of a real AWS S3 object: read, write, check existence. There is also a convenient decorator Ocket.Text that simplifies working with binary objects:

Ocket.Text ocket = new Ocket.Text(ocket_from_s3);
if (ocket.exists()) {
  System.out.print(ocket.read());
} else {
  ocket.write("Hello, world!");
}

Now, you can pass an object to another class, instead of giving it your AWS credentials, bucket name, and object name. You simply pass a Java object, which encapsulates all AWS interaction details.

Extendability Through Decoration

Since jcabi-s3 exposes all entities as interfaces, they can easily be extended through encapsulation (Decorator Pattern).

For example, you want your code to retry S3 object read operations a few times before giving up and throwing an IOException (by the way, this is a very good practice when working with web services). So, you want all your S3 reading operations to be redone a few times if first attempts fail.

You define a new decorator class, say, RetryingOcket, which encapsulates an original Ocket:

import java.io.IOException;
import java.io.OutputStream;
public class RetryingOcket implements Ocket {
  private final Ocket origin;
  public RetryingOcket(Ocket ocket) {
    this.origin = ocket;
  }
  @Override
  public void read(OutputStream stream) throws IOException {
    int attempt = 0;
    while (true) {
      try {
        this.origin.read(stream);
        return; // success, stop retrying
      } catch (IOException ex) {
        if (attempt++ > 3) {
          throw ex;
        }
      }
    }
  }
  // same for other methods
}

Now, everywhere where Ocket is expected you send an instance of RetryingOcket that wraps your original object:

foo.process(new RetryingOcket(ocket));

Method foo.process() won't see a difference, since it is the same Ocket interface it is expecting.

By the way, this retry functionality is implemented out-of-the-box in jcabi-s3, in com.jcabi.s3.retry package.

Easy Mocking

Again, due to the fact that all entities in jcabi-s3 are interfaces, they are very easy to mock. For example, your class expects an S3 object, reads its data and calculates the MD5 hash (I'm using DigestUtils from commons-codec):

import com.jcabi.s3.Ocket;
import org.apache.commons.codec.digest.DigestUtils;
import com.jcabi.s3.Ocket;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.commons.codec.digest.DigestUtils;
public class S3Md5Hash {
  private final Ocket ocket;
  public S3Md5Hash(Ocket okt) {
    this.ocket = okt;
  }
  public String hash() throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    this.ocket.read(baos);
    return DigestUtils.md5Hex(baos.toByteArray());
  }
}

Here is how simple a unit test will look (try to create a unit test for a class using AWS SDK and you will see the difference):

import com.jcabi.s3.Ocket;
import java.io.IOException;
import java.io.OutputStream;
import org.junit.Assert;
import org.junit.Test;
import org.mockito.Mockito;
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;
public class S3Md5HashTest {
  @Test
  public void generatesHash() throws IOException {
    Ocket ocket = Mockito.mock(Ocket.class);
    Mockito.doAnswer(
      new Answer<Void>() {
        public Void answer(final InvocationOnMock inv) throws IOException {
          OutputStream.class.cast(inv.getArguments()[0]).write(' ');
          return null;
        }
      }
    ).when(ocket).read(Mockito.any(OutputStream.class));
    String hash = new S3Md5Hash(ocket).hash();
    Assert.assertEquals("7215ee9c7d9dc229d2921a40e899ec5f", hash);
  }
}

I'm using JUnit and Mockito in this test.

Immutability

All classes in jcabi-s3 are annotated with @Immutable and are truly immutable.

The library ships as a JAR dependency in Maven Central (get its latest versions in Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-s3</artifactId>
</dependency>

As always, your comments and criticism are welcome as GitHub issues.


Get Rid of Java Static Loggers


This is a very common practice in Java (using LoggerFactory from slf4j):

import org.slf4j.LoggerFactory;
public class Foo {
  private static final Logger LOGGER =
    LoggerFactory.getLogger(Foo.class);
  public void save(String file) {
    // save the file
    if (Foo.LOGGER.isInfoEnabled()) {
      Foo.LOGGER.info("file {} saved successfully", file);
    }
  }
}

What's wrong with it? Code duplication.

This static LOGGER property has to be declared in every class where logging is required. Just a few lines of code, but this is pure noise, as I see it.


To make life easier, I created a library about two years ago, jcabi-log, which has a convenient utility class Logger (yes, I know that utility classes are evil).

import com.jcabi.log.Logger;
public class Foo {
  public void save(String file) {
    // save the file
    Logger.info(this, "file %s saved successfully", file);
  }
}

This looks much cleaner to me and does exactly the same---sends a single log line to the SLF4J logging facility. Besides, it automatically checks whether the given logging level is enabled (a performance optimization) and formats the given string using Formatter (same as String.format()).

For convenience, there are also a number of "decors" implemented in the library.

The library ships as a JAR dependency in Maven Central (get its latest versions in Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-log</artifactId>
</dependency>


MySQL Maven Plugin


I was using MySQL in a few Java web projects and found that there was no Maven plugin that would help me test my DAOs against a real MySQL server. There are plenty of mechanisms for mocking a database persistence layer, both in memory and on disk. However, it is always good to make sure that your classes are tested against a database identical to the one you have in your production environment.


I've created my own Maven plugin, jcabi-mysql-maven-plugin, that does exactly two things: it starts a MySQL server in the pre-integration-test phase and shuts it down in post-integration-test.

This is how you configure it in pom.xml (see also its full usage instructions):

<project>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>reserve-network-port</goal>
            </goals>
            <configuration>
              <portNames>
                <portName>mysql.port</portName>
              </portNames>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-dependency-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>unpack</goal>
            </goals>
            <configuration>
              <artifactItems>
                <artifactItem>
                  <groupId>com.jcabi</groupId>
                  <artifactId>mysql-dist</artifactId>
                  <version>5.6.14</version>
                  <classifier>${mysql.classifier}</classifier>
                  <type>zip</type>
                  <overWrite>false</overWrite>
                  <outputDirectory>
                    ${project.build.directory}/mysql-dist
                  </outputDirectory>
                </artifactItem>
              </artifactItems>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>com.jcabi</groupId>
        <artifactId>jcabi-mysql-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>mysql-test</id>
            <goals>
              <goal>classify</goal>
              <goal>start</goal>
              <goal>stop</goal>
            </goals>
            <configuration>
              <port>${mysql.port}</port>
              <data>${project.build.directory}/mysql-data</data>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-failsafe-plugin</artifactId>
        <configuration>
          <systemPropertyVariables>
            <mysql.port>${mysql.port}</mysql.port>
          </systemPropertyVariables>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>integration-test</goal>
              <goal>verify</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  [...]
</project>

There are four plugins configured above. Let's take a look at what each one does.

  1. build-helper-maven-plugin reserves a temporary random TCP port, which will be used by the MySQL server. We don't want to start a server on its default port 3306, because another server could already be running there. Besides that, if we use a hard-coded TCP port, we won't be able to run multiple builds in parallel. Maybe that's not a big deal when you're developing locally, but in a continuous integration environment it can be a problem. That's why we're reserving a TCP port first.

  2. maven-dependency-plugin downloads a MySQL distribution in a zip archive (a rather big file, over 300MB for Linux) and unpacks it. This archive contains exactly the same files as a traditional MySQL installation. Once the archive is unpacked, it is ready to serve SQL requests as a normal MySQL server.

  3. jcabi-mysql-maven-plugin starts a server, binding it to a TCP port reserved randomly. The main responsibility of my Maven plugin is to make sure that MySQL server starts correctly on every platform (Mac OS, Linux, Windows) and stops when it's not needed any more. All the rest is done by the MySQL distribution itself.

  4. maven-failsafe-plugin runs unit tests during the integration-test phase. Its main difference from maven-surefire-plugin is that it doesn't fail the build when some tests fail. Instead, it saves all failures into supplementary files in the target directory and allows the build to continue. Later, when we call its verify goal, it fails the build if there were any errors during the integration-test goal execution.

To be precise, this is the order in which Maven will execute configured goals:

jcabi-mysql-maven-plugin:classify
maven-dependency-plugin:unpack
build-helper-maven-plugin:reserve-network-port
jcabi-mysql-maven-plugin:start
maven-failsafe-plugin:integration-test
jcabi-mysql-maven-plugin:stop
maven-failsafe-plugin:verify

Run mvn clean install and see how it works. If it doesn't work for some reason, don't hesitate to report an issue to GitHub.

Now it's time to create an integration test, which will connect to the temporary MySQL server, create a table there and insert some data into it. This is just an example to show that the MySQL server is running and capable of serving transactions (I'm using jcabi-jdbc):

public class FooITCase {
  private static final String PORT = System.getProperty("mysql.port");
  @Test
  public void worksWithMysqlServer() throws Exception {
    Connection conn = DriverManager.getConnection(
      String.format(
        "jdbc:mysql://localhost:%s/root?user=root&password=root",
        FooITCase.PORT
      )
    );
    new JdbcSession(conn)
      .sql("CREATE TABLE foo (id INT PRIMARY KEY)")
      .execute();
  }
}

If you're using Hibernate, just create a db.properties file in src/test/resources directory. In that file you would do something like:

hibernate.connection.url=jdbc:mysql://localhost:${mysql.port}/root
hibernate.connection.username=root
hibernate.connection.password=root

Maven will replace ${mysql.port} with the number of the reserved TCP port while copying resources. This operation is called "resources filtering," and you can read about it here.
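Note that the substitution only happens when resource filtering is switched on for test resources, which it is not by default. A minimal pom.xml fragment to enable it (assuming the standard src/test/resources layout) might look like:

```xml
<build>
  <testResources>
    <testResource>
      <directory>src/test/resources</directory>
      <!-- turn on ${...} substitution during copying -->
      <filtering>true</filtering>
    </testResource>
  </testResources>
</build>
```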

That's pretty much it. I'm using jcabi-mysql-maven-plugin in a few projects, and it helps me to stay confident that my code works with a real MySQL server. I'm also using the Liquibase Maven plugin in order to populate an empty server with tables required for the application. Nevertheless, that is a story for the next post :)

© Yegor Bugayenko 2014–2018

Atomic Counters at Stateful.co


Amazon DynamoDB is a great NoSQL cloud database. It is cheap, highly reliable and rather powerful. I'm using it in many web systems.

There is one feature that it lacks, though---auto-increment attributes.

Say that you have a table with a list of messages:

+------+----------------------------+
| id   | Attributes                 |
+------+----------------------------+
| 205  | author="jeff", text="..."  |
| 206  | author="bob", text="..."   |
| 207  | author="alice", text="..." |
+------+----------------------------+

Every time you add a new item to the table, a new value of id has to be set. And this has to be done with concurrency in mind. SQL databases like PostgreSQL, Oracle, MySQL and others support auto-increment features. When you add a new record to the table, the value of the primary key is omitted and the server generates the next one automatically. If a number of INSERT requests arrive at the same time, the server guarantees that the numbers won't be duplicated.

However, DynamoDB doesn't have this feature. Instead, DynamoDB has Atomic Counters and Conditional Updates, which are very similar features. Still, they're not exactly the same.

In the case of an atomic counter, you should create a supplementary table and keep the latest value of id in it.

In the case of conditional updates, you should retry a few times when a collision occurs.
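The optimistic retry pattern behind conditional updates can be sketched in plain Java. Here AtomicLong.compareAndSet stands in for a DynamoDB conditional update with an "expected" clause (the IdAllocator class is made up for illustration), just to show the shape of the loop:

```java
import java.util.concurrent.atomic.AtomicLong;

public final class IdAllocator {
  // stands in for the "latest id" attribute kept in a DynamoDB item
  private final AtomicLong latest = new AtomicLong(0L);

  // optimistic increment: read the current value, then try to write
  // current+1 on the condition that nobody changed it in between;
  // on a collision, re-read and retry -- the same shape as a
  // DynamoDB conditional update with an "expected" clause
  public long next() {
    while (true) {
      final long current = this.latest.get();
      if (this.latest.compareAndSet(current, current + 1L)) {
        return current + 1L;
      }
    }
  }
}
```

The loop terminates as soon as one write wins; under contention, losers simply re-read and try again.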


To make life easier in a few of my applications, I created a simple web service---stateful.co. It provides a simple atomic counter feature through its RESTful API.

First, you create a counter with a unique name. Then, you set its initial value (it is zero by default). And that's it. Every time you need a new value for the id column in a DynamoDB table, you make an HTTP request to stateful.co, asking it to increment your counter by one and return the next value.

stateful.co guarantees that values returned will never duplicate each other---no matter how many clients are using a counter or how fast they request increments simultaneously.

Moreover, I designed a small Java SDK for stateful.co. All you need to do is add this java-sdk.jar Maven dependency to your project:

<dependency>
  <groupId>co.stateful</groupId>
  <artifactId>java-sdk</artifactId>
  <version>0.6</version>
</dependency>

And, you can use stateful.co counters from Java code:

Sttc sttc = new RtSttc(
  new URN("urn:github:526301"),
  "9FF3-41E0-73FB-F900"
);
Counters counters = sttc.counters();
Counter counter = counters.get("foo");
long value = counter.incrementAndGet(1L);
System.out.println("new value: " + value);

You can review authentication parameters for RtSttc constructor at stateful.co.

The service is absolutely free of charge.


Object-Oriented GitHub API


GitHub is an awesome platform for maintaining Git sources and tracking project issues. I moved all my projects (both private and public) to GitHub about three years ago and have no regrets. Moreover, GitHub gives access to almost all of its features through its RESTful JSON API.

There are a few Java SDKs that wrap and expose the API. I tried to use them, but faced a number of issues:

  • They are not really object-oriented (even though one of them has a description that says it is)
  • They are not based on JSR-353 (JSON Java API)
  • They provide no mocking instruments
  • They don't cover the entire API and can't be extended

Keeping in mind all those drawbacks, I created my own library---jcabi-github. Let's look at its most important advantages.

Object Oriented for Real

The GitHub server is an object. A collection of issues is an object, an individual issue is an object, its author is an object, etc. For example, to retrieve the name of the author of an issue, we use:

GitHub github = new RtGitHub(/* credentials */);
Repos repos = github.repos();
Repo repo = repos.get(new Coordinates.Simple("jcabi/jcabi-github"));
Issues issues = repo.issues();
Issue issue = issues.get(123);
User author = new Issue.Smart(issue).author();
System.out.println(author.name());

Needless to say, GitHub, Repos, Repo, Issues, Issue, and User are interfaces. Classes that implement them are not visible in the library.

Mock Engine

MkGitHub class is a mock version of a GitHub server. It behaves almost exactly the same as a real server and is the perfect instrument for unit testing. For example, say that you're testing a method that is supposed to post a new issue to GitHub and add a message into it. Here is how the unit test would look:

public class FooTest {
  @Test
  public void createsIssueAndPostsMessage() {
    GitHub github = new MkGitHub("jeff");
    Repo repo = github.repos().create(
      Json.createObjectBuilder().add("name", "test").build()
    );
    new Foo().doTheThing(github);
    MatcherAssert.assertThat(
      repo.issues().get(1).comments().iterate(),
      Matchers.not(Matchers.emptyIterable())
    );
  }
}

This is much more convenient and compact than traditional mocking via Mockito or a similar framework.

Extensible

It is based on JSR-353 and uses jcabi-http for HTTP request processing. This combination makes it highly customizable and extensible, when some GitHub feature is not covered by the library (and there are many of them).

For example, say you want to get the value of the hireable attribute of a User. The User.Smart class doesn't have a method for it. So, here is how you would get it:

User user = // get it somewhere
// name() method exists in User.Smart, let's use it
System.out.println(new User.Smart(user).name());
// there is no hireable() method there
System.out.println(user.json().getBoolean("hireable"));

We're using the json() method, which returns an instance of JsonObject from JSR-353 (part of Java EE 7).

No other library allows such direct access to JSON objects returned by the GitHub server.

Let's see another example. Say, you want to use some feature from GitHub that is not covered by the API. You get a Request object from GitHub interface and directly access the HTTP entry point of the server:

GitHub github = new RtGitHub(oauthKey);
int found = github.entry()
  .uri().path("/search/repositories").back()
  .method(Request.GET)
  .fetch()
  .as(JsonResponse.class)
  .json()
  .readObject()
  .getInt("total_count");

The HTTP client here is jcabi-http, the same library jcabi-github uses internally.

Immutable

All classes are truly immutable and annotated with @Immutable. This may sound like a minor benefit, but it was very important for me. I use this annotation in all my projects to enforce immutability.
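What "truly immutable" means in practice: the class is final, all fields are final and set once in the constructor, and there are no setters. A minimal sketch (the Coords class is made up for illustration):

```java
// Coords is a made-up value object; it shows the shape of a "truly
// immutable" class: final class, final fields, no setters
public final class Coords {
  private final String user;
  private final String repo;
  public Coords(String usr, String rpo) {
    this.user = usr;
    this.repo = rpo;
  }
  @Override
  public String toString() {
    return String.format("%s/%s", this.user, this.repo);
  }
}
```

Once constructed, an instance can never change, so it is safe to share between threads and cache freely.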

Version 0.8

A few days ago we released version 0.8. It is a major release that includes over 1,200 commits. It covers the entire GitHub API and is supposed to be very stable. The library ships as a JAR dependency in Maven Central (get its latest version there):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-github</artifactId>
</dependency>


Why NULL is Bad?


A simple example of NULL usage in Java:

public Employee getByName(String name) {
  int id = database.find(name);
  if (id == 0) {
    return null;
  }
  return new Employee(id);
}

What is wrong with this method?

It may return NULL instead of an object---that's what is wrong. NULL is a terrible practice in an object-oriented paradigm and should be avoided at all costs. There have been a number of opinions about this published already, including the Null References: The Billion Dollar Mistake presentation by Tony Hoare and the entire Object Thinking book by David West.

Here, I'll try to summarize all the arguments and show examples of how NULL usage can be avoided and replaced with proper object-oriented constructs.

Basically, there are two possible alternatives to NULL.

The first one is the Null Object design pattern (the best way is to make it a constant):

public Employee getByName(String name) {
  int id = database.find(name);
  if (id == 0) {
    return Employee.NOBODY;
  }
  return new Employee(id);
}
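The NOBODY constant from the snippet above could be declared right inside Employee. A minimal self-contained sketch (the names and signatures are made up for illustration):

```java
public class Employee {
  // the shared Null Object: one well-known instance that is returned
  // instead of NULL and still answers basic questions
  public static final Employee NOBODY = new Employee(0) {
    @Override
    public String name() {
      return "nobody";
    }
  };
  private final int number;
  public Employee(int id) {
    this.number = id;
  }
  public String name() {
    return "employee #" + this.number;
  }
}
```

Callers can compare against Employee.NOBODY explicitly when they care, but they never have to guard against NULL.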

The second possible alternative is to fail fast by throwing an Exception when you can't return an object:

public Employee getByName(String name) {
  int id = database.find(name);
  if (id == 0) {
    throw new EmployeeNotFoundException(name);
  }
  return new Employee(id);
}

Now, let's see the arguments against NULL.

Besides Tony Hoare's presentation and David West's book mentioned above, I read these publications before writing this post: Clean Code by Robert Martin, Code Complete by Steve McConnell, Say "No" to "Null" by John Sonmez, Is returning null bad design? discussion at StackOverflow.

Ad-hoc Error Handling

Every time you get an object as an input, you must check whether it is NULL or a valid object reference. If you forget to check, a NullPointerException (NPE) may break execution at runtime. Thus, your logic becomes polluted with multiple checks and if/then/else forks:

// this is a terrible design, don't reuse
Employee employee = dept.getByName("Jeffrey");
if (employee == null) {
  System.out.println("can't find an employee");
  System.exit(-1);
} else {
  employee.transferTo(dept2);
}

This is how exceptional situations are supposed to be handled in C and other imperative procedural languages. OOP introduced exception handling primarily to get rid of these ad-hoc error handling blocks. In OOP, we let exceptions bubble up until they reach an application-wide error handler and our code becomes much cleaner and shorter:

dept.getByName("Jeffrey").transferTo(dept2);

Consider NULL references a legacy of procedural programming, and use 1) Null Objects or 2) Exceptions instead.

Ambiguous Semantic

In order to explicitly convey its meaning, the function getByName() would have to be named getByNameOrNullIfNotFound(). The same should happen with every function that returns an object or NULL. Otherwise, ambiguity is inevitable for a code reader. Thus, to keep the semantics unambiguous, you would have to give functions longer names.

To get rid of this ambiguity, always return a real object, a null object or throw an exception.

Some may argue that we sometimes have to return NULL, for the sake of performance. For example, method get() of interface Map in Java returns NULL when there is no such item in the map:

Employee employee = employees.get("Jeffrey");
if (employee == null) {
  throw new EmployeeNotFoundException();
}
return employee;

This code searches the map only once, thanks to the usage of NULL in Map. If we refactored Map so that its get() method threw an exception when nothing is found, our code would look like this:

if (!employees.containsKey("Jeffrey")) { // first search
  throw new EmployeeNotFoundException();
}
return employees.get("Jeffrey"); // second search

Obviously, this method is twice as slow as the first one. What to do?

The Map interface (no offense to its authors) has a design flaw. Its get() method should have returned an Iterator, so that our code would look like:

Iterator found = employees.search("Jeffrey");
if (!found.hasNext()) {
  throw new EmployeeNotFoundException();
}
return found.next();

BTW, that is exactly how C++ STL map::find() method is designed.
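Such an Iterator-returning search() could be sketched as a thin wrapper around an existing Map (the Searchable wrapper and its search() method are hypothetical, just for illustration):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;

public final class Searchable<K, V> {
  private final Map<K, V> map;
  public Searchable(Map<K, V> origin) {
    this.map = origin;
  }
  // one lookup only: an empty iterator means "not found",
  // and no NULL ever crosses the method boundary
  public Iterator<V> search(K key) {
    final V value = this.map.get(key);
    if (value == null) {
      return Collections.<V>emptyList().iterator();
    }
    return Collections.singleton(value).iterator();
  }
}
```

The single internal null check is an implementation detail of the wrapper; clients only ever see an iterator that is either empty or holds one element.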

Computer Thinking vs. Object Thinking

Statement if (employee == null) is understood by someone who knows that an object in Java is a pointer to a data structure and that NULL is a pointer to nothing (0x00000000, in Intel x86 processors).

However, if you start thinking as an object, this statement makes much less sense. This is how our code looks from an object point of view:

- Hello, is it a software department?
- Yes.
- Let me talk to your employee "Jeffrey" please.
- Hold the line please...
- Hello.
- Are you NULL?

The last question in this conversation sounds weird, doesn't it?

Instead, if they hang up the phone after our request to speak to Jeffrey, that causes a problem for us (Exception). At that point, we try to call again or inform our supervisor that we can't reach Jeffrey and complete a bigger transaction.

Alternatively, they may let us speak to another person, who is not Jeffrey, but who can help with most of our questions or refuse to help if we need something "Jeffrey specific" (Null Object).

Slow Failing

Instead of failing fast, the code above attempts to die slowly, killing others on its way. Instead of letting everyone know that something went wrong and that exception handling should start immediately, it hides the failure from its client.

This argument is close to the "ad-hoc error handling" discussed above.

It is a good practice to make your code as fragile as possible, letting it break when necessary.

Make your methods extremely demanding as to the data they manipulate. Let them complain by throwing exceptions if the provided data is not sufficient or simply doesn't fit the main usage scenario of the method.

Otherwise, return a Null Object, that exposes some common behavior and throws exceptions on all other calls:

public Employee getByName(String name) {
  int id = database.find(name);
  Employee employee;
  if (id == 0) {
    employee = new Employee() {
      @Override
      public String name() {
        return "anonymous";
      }
      @Override
      public void transferTo(Department dept) {
        throw new AnonymousEmployeeException(
          "I can't be transferred, I'm anonymous"
        );
      }
    };
  } else {
    employee = new Employee(id);
  }
  return employee;
}

Mutable and Incomplete Objects

In general, it is highly recommended to design objects with immutability in mind. This means that an object gets all the necessary knowledge during its instantiation and never changes its state during its entire life-cycle.

Very often, NULL values are used in lazy loading, to make objects incomplete and mutable. For example:

public class Department {
  private Employee found = null;
  public synchronized Employee manager() {
    if (this.found == null) {
      this.found = new Employee("Jeffrey");
    }
    return this.found;
  }
}

This technique, although widely used, is an anti-pattern in OOP. Mostly because it makes an object responsible for performance problems of the computational platform, which is something an Employee object should not be aware of.

Instead of managing a state and exposing its business-relevant behavior, an object has to take care of the caching of its own results---this is what lazy loading is about.

Caching is not something an employee does in the office, does he?

The solution? Don't use lazy loading in such a primitive way, as in the example above. Instead, move this caching problem to another layer of your application.

In Java, for example, you can use aspect-oriented programming. jcabi-aspects has a @Cacheable annotation that caches the value returned by a method:

import com.jcabi.aspects.Cacheable;
public class Department {
  @Cacheable(forever = true)
  public Employee manager() {
    return new Employee("Jacky Brown");
  }
}
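If a runtime AOP dependency is not an option, the same separation can be achieved with a plain decorator. Here is a minimal sketch (the CachedDepartment class and the simplified Department/Employee types are made up for illustration):

```java
// Department and Employee are simplified stand-ins for the article's
// types, just to keep the sketch self-contained
interface Department {
  Employee manager();
}

final class Employee {
  private final String name;
  Employee(String label) {
    this.name = label;
  }
  String name() {
    return this.name;
  }
}

// the decorator: caching lives in its own layer, while the decorated
// Department stays clean and knows nothing about performance concerns
final class CachedDepartment implements Department {
  private final Department origin;
  private Employee cached;
  CachedDepartment(Department dept) {
    this.origin = dept;
  }
  @Override
  public synchronized Employee manager() {
    if (this.cached == null) {
      this.cached = this.origin.manager();
    }
    return this.cached;
  }
}
```

The original Department stays immutable; only the decorator carries the mutable cache, and it can be dropped or swapped without touching the business class.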

I hope this analysis was convincing enough that you will stop NULL-ing your code :)


OOP Alternative to Utility Classes


A utility class (aka helper class) is a "structure" that has only static methods and encapsulates no state. StringUtils, IOUtils, FileUtils from Apache Commons; Iterables and Iterators from Guava, and Files from JDK7 are perfect examples of utility classes.

This design idea is very popular in the Java world (as well as C#, Ruby, etc.) because utility classes provide common functionality used everywhere.

Here, we want to follow the DRY principle and avoid duplication. Therefore, we place common code blocks into utility classes and reuse them when necessary:

// This is a terrible design, don't reuse
public class NumberUtils {
  public static int max(int a, int b) {
    return a > b ? a : b;
  }
}

Indeed, this is a very convenient technique!

Utility Classes Are Evil

However, in an object-oriented world, utility classes are considered a very bad (some even may say "terrible") practice.

There have been many discussions of this subject; to name a few: Are Helper Classes Evil? by Nick Malik, Why helper, singletons and utility classes are mostly bad by Simon Hart, Avoiding Utility Classes by Marshal Ward, Kill That Util Class! by Dhaval Dalal, Helper Classes Are A Code Smell by Rob Bagby.

Additionally, there are a few questions on StackExchange about utility classes: If a “Utilities” class is evil, where do I put my generic code?, Utility Classes are Evil.

A dry summary of all their arguments is that utility classes are not proper objects; therefore, they don't fit into an object-oriented world. They were inherited from procedural programming, mostly because we were used to a functional decomposition paradigm back then.

Assuming you agree with the arguments and want to stop using utility classes, I'll show by example how these creatures can be replaced with proper objects.

Procedural Example

Say, for instance, you want to read a text file, split it into lines, trim every line and then save the results in another file. This can be done with FileUtils from Apache Commons:

void transform(File in, File out) {
  Collection<String> src = FileUtils.readLines(in, "UTF-8");
  Collection<String> dest = new ArrayList<>(src.size());
  for (String line : src) {
    dest.add(line.trim());
  }
  FileUtils.writeLines(out, dest, "UTF-8");
}

The above code may look clean; however, this is procedural programming, not object-oriented. We are manipulating data (bytes and bits) and explicitly instructing the computer from where to retrieve them and then where to put them on every single line of code. We're defining a procedure of execution.

Object-Oriented Alternative

In an object-oriented paradigm, we should instantiate and compose objects, thus letting them manage data when and how they desire. Instead of calling supplementary static functions, we should create objects that are capable of exposing the behavior we are seeking:

public final class Max extends Number {
  private final int a;
  private final int b;
  public Max(int x, int y) {
    this.a = x;
    this.b = y;
  }
  @Override
  public int intValue() {
    return this.a > this.b ? this.a : this.b;
  }
  @Override
  public long longValue() {
    return this.intValue();
  }
  @Override
  public float floatValue() {
    return this.intValue();
  }
  @Override
  public double doubleValue() {
    return this.intValue();
  }
}

This procedural call:

int max = NumberUtils.max(10, 5);

Will become object-oriented:

int max = new Max(10, 5).intValue();

Potato, potato? Not really; just read on...

Objects Instead of Data Structures

This is how I would design the same file-transforming functionality as above but in an object-oriented manner:

void transform(File in, File out) {
  Collection<String> src = new Trimmed(
    new FileLines(new UnicodeFile(in))
  );
  Collection<String> dest = new FileLines(
    new UnicodeFile(out)
  );
  dest.addAll(src);
}

FileLines implements Collection<String> and encapsulates all file reading and writing operations. An instance of FileLines behaves exactly as a collection of strings and hides all I/O operations. When we iterate it---a file is being read. When we addAll() to it---a file is being written.

Trimmed also implements Collection<String> and encapsulates a collection of strings (Decorator pattern). Every time the next line is retrieved, it gets trimmed.
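A decorator like Trimmed could be sketched in a few lines. This is not the real class, just a minimal version that supports iteration by extending AbstractCollection:

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;

// a sketch, not the real class: trims each string lazily,
// at the moment it is retrieved from the iterator
public final class Trimmed extends AbstractCollection<String> {
  private final Collection<String> origin;
  public Trimmed(Collection<String> source) {
    this.origin = source;
  }
  @Override
  public Iterator<String> iterator() {
    final Iterator<String> inner = this.origin.iterator();
    return new Iterator<String>() {
      @Override
      public boolean hasNext() {
        return inner.hasNext();
      }
      @Override
      public String next() {
        return inner.next().trim();
      }
      @Override
      public void remove() {
        inner.remove();
      }
    };
  }
  @Override
  public int size() {
    return this.origin.size();
  }
}
```

Nothing is trimmed until next() is called, which is exactly the lazy behavior the article describes.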

All the classes participating in the snippet are rather small: Trimmed, FileLines, and UnicodeFile. Each of them is responsible for its own single feature, thus perfectly following the single responsibility principle.

On our side, as users of the library, this may not seem so important, but for their developers it is an imperative. It is much easier to develop, maintain and unit-test a class like FileLines than a readLines() method in FileUtils, a utility class with 80+ methods and 3,000 lines. Seriously, look at its source code.

An object-oriented approach enables lazy execution. The in file is not read until its data is required. If we fail to open out due to some I/O error, the first file won't even be touched. The whole show starts only after we call addAll().

All lines in the second snippet, except the last one, instantiate and compose smaller objects into bigger ones. This object composition is rather cheap for the CPU since it doesn't cause any data transformations.

Besides that, it is obvious that the second script runs in O(1) space, while the first one executes in O(n). This is the consequence of our procedural approach to data in the first script.

In an object-oriented world, there is no data; there are only objects and their behavior!


DynamoDB Local Maven Plugin


DynamoDB Local is a locally running copy of Amazon DynamoDB server. Amazon developed the tool and based it on SQLite. It acts as a real DynamoDB service through the RESTful API.

I guess DynamoDB Local is meant to be used in integration testing, and this is how we're going to use it below.

I use Maven to run all of my Java integration testing using maven-failsafe-plugin. The philosophy of integration testing with Maven is that you start all your supplementary test stubs during the pre-integration-test phase, run your tests in the integration-test phase and then shutdown all stubs during the post-integration-test.


It would be great if it were possible to use DynamoDB Local that way. I didn't find any Maven plugins for that purpose, so I decided to create my own---jcabi-dynamodb-maven-plugin.

Full usage details for the plugin are explained on its website. However, here is a simple example (get its latest versions in Maven Central):

<plugin>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-dynamodb-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>start</goal>
        <goal>stop</goal>
      </goals>
      <configuration>
        <port>10500</port>
        <dist>${project.build.directory}/dynamodb-dist</dist>
      </configuration>
    </execution>
  </executions>
</plugin>

The above configuration will start DynamoDB Local right before running integration tests, and then stop it immediately afterwards. The server will listen on TCP port 10500. While a hard-coded number is used in the example, you should allocate a random port instead.

When the DynamoDB Local server is up and running, we can create an integration test for it:

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.ListTablesResult;
import org.junit.Test;
public class FooITCase {
  @Test
  public void worksWithAwsDynamoDb() {
    AmazonDynamoDB aws = new AmazonDynamoDBClient(
      new BasicAWSCredentials("", "")
    );
    aws.setEndpoint("http://localhost:10500");
    ListTablesResult list = aws.listTables();
    for (String name : list.getTableNames()) {
      System.out.println("table found: " + name);
    }
  }
}

Of course, there won't be any output because the server starts without any tables. Since the server is empty, you should create tables before every integration test, using createTable() from DynamoDB SDK.

To avoid this type of extra hassle, in the latest version 0.6 of jcabi-dynamodb-maven-plugin we introduced a new goal create-tables:

<plugin>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-dynamodb-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>create-tables</goal>
      </goals>
      <configuration>
        <tables>
          <table>${basedir}/src/test/dynamodb/foo.json</table>
        </tables>
      </configuration>
    </execution>
  </executions>
</plugin>

The foo.json file used above should contain a JSON request that is sent to DynamoDB Local right after it is up and running. The request should comply with the specification of CreateTable request. For example:

{
  "AttributeDefinitions": [
    {
      "AttributeName": "id",
      "AttributeType": "N"
    }
  ],
  "KeySchema": [
    {
      "AttributeName": "id",
      "KeyType": "HASH"
    }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": "1",
    "WriteCapacityUnits": "1"
  },
  "TableName": "foo"
}

The table will be created during the pre-integration-test phase and dropped at the post-integration-test phase. Now, we can make our integration test much more meaningful with the help of jcabi-dynamo:

import com.jcabi.dynamo.Attributes;
import com.jcabi.dynamo.Conditions;
import com.jcabi.dynamo.Credentials;
import com.jcabi.dynamo.Region;
import com.jcabi.dynamo.Table;
import org.hamcrest.MatcherAssert;
import org.hamcrest.Matchers;
import org.junit.Test;
public class FooITCase {
  @Test
  public void worksWithAwsDynamoDb() {
    Region region = new Region.Simple(new Credentials.Simple("", ""));
    Table table = region.table("foo");
    table.put(
      new Attributes()
        .with("id", 123)
        .with("name", "Robert DeNiro")
    );
    MatcherAssert.assertThat(
      table.frame().where("id", Conditions.equalTo(123)),
      Matchers.not(Matchers.emptyIterable())
    );
  }
}

The above test will put a new item into the table and then assert that the item is there.

The plugin was tested with three operating systems, and proved to work without problems: Mac OS X 10.8.5, Windows 7 SP1 and Ubuntu Linux 12.04 Desktop.


W3C Java Validators


A few years ago, I created two Java wrappers for W3C validators (HTML and CSS). Both wrappers seemed to work fine and were even listed by W3C on their website in the API section. Until recently, these wrappers were part of the ReXSL library.

A few days ago, though, I took the wrappers out of ReXSL and published them as a standalone library---jcabi-w3c. Consequently, now seems to be a good time to write a few words about them.

Below is an example that demonstrates how you can validate an HTML document against W3C compliance rules:

import com.jcabi.w3c.ValidatorBuilder;
assert ValidatorBuilder.html()
  .validate("<html>hello, world!</html>")
  .valid();

The valid() method is a black or white indicator that returns false when the document is not valid. Additionally, you can obtain more information through a list of "defects" returned by the W3C server:

Collection<Defect> defects = ValidatorBuilder.html()
  .validate("<html>hello, world!</html>")
  .errors();

The same can be done with CSS:

Collection<Defect> defects = ValidatorBuilder.css()
  .validate("body { font-family: Arial; }")
  .errors();

Personally, I think it is a good practice to validate all HTML pages produced by your application against W3C rules during integration testing. It's not a matter of seeking perfection, but rather of preventing bigger problems later.

These dependencies are mandatory when using jcabi-w3c (get their latest versions in Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-w3c</artifactId>
</dependency>
<dependency>
  <groupId>org.glassfish</groupId>
  <artifactId>javax.json</artifactId>
</dependency>
<dependency>
  <groupId>com.sun.jersey</groupId>
  <artifactId>jersey-client</artifactId>
</dependency>
<dependency>
  <groupId>org.hamcrest</groupId>
  <artifactId>hamcrest-core</artifactId>
</dependency>


XML/XPath Matchers for Hamcrest


Hamcrest is my favorite instrument in unit testing. It replaces the JUnit procedural assertions of org.junit.Assert with an object-oriented mechanism. However, I will discuss that subject in more detail sometime later.

Now, though, I want to demonstrate a new library published today on GitHub and Maven Central: jcabi-matchers, a collection of Hamcrest matchers for making XPath assertions against XML and XHTML documents.

Let's say, for instance, a class under test produces an XML document that needs to contain a single <message> element with the content "hello, world!"

This is how that code would look in a unit test:

import com.jcabi.matchers.XhtmlMatchers;
import org.hamcrest.MatcherAssert;
import org.junit.Test;
public class FooTest {
  @Test
  public void hasWelcomeMessage() {
    MatcherAssert.assertThat(
      new Foo().createXml(),
      XhtmlMatchers.hasXPaths(
        "/document[count(message)=1]",
        "/document/message[.='hello, world!']"
      )
    );
  }
}

There are two alternatives to the above that I'm aware of, which do almost the same thing: xml-matchers by David Ehringer and the hasXPath() method in Hamcrest itself.

I have tried them both, but faced a number of problems.

First, Hamcrest's hasXPath() works only with an instance of Node. With this method, converting a String into a Node becomes a repetitive, routine task in every unit test.

This is a very strange limitation of Hamcrest in contrast to jcabi-matchers, which works with almost anything, from a String to a Reader and even an InputStream.
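For comparison, here is roughly what that repetitive conversion looks like with the JDK alone---a sketch of the boilerplate that Hamcrest's hasXPath() forces on every test (the class name and sample XML are made up for illustration):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeBoilerplate {
  public static void main(String[] args) throws Exception {
    // Hamcrest's hasXPath() accepts only a Node, so every test
    // must first parse its String fixture into a DOM tree.
    Node node = DocumentBuilderFactory.newInstance()
      .newDocumentBuilder()
      .parse(new InputSource(new StringReader(
        "<document><message>hello, world!</message></document>"
      )))
      .getDocumentElement();
    System.out.println(node.getNodeName()); // prints "document"
  }
}
```

jcabi-matchers performs this conversion internally, which is exactly the boilerplate it removes from each test.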

Second, XmlMatchers from xml-matchers provides a very inconvenient way of working with namespaces. Before you can use an XPath query with a non-default namespace, you have to create an instance of NamespaceContext.

The library provides a simple implementation of this interface, but it still requires extra code in every unit test.

jcabi-matchers simplifies namespace handling even further, as it pre-defines the most popular namespaces, including xhtml, xs, xsl, etc.

The following example works right out of the box---without any extra configuration:

MatcherAssert.assertThat(
  new URL("http://www.google.com").getContent(),
  XhtmlMatchers.hasXPath("//xhtml:body")
);

To summarize, my primary objective with the library was its simplicity of usage.

Typical Mistakes in Java Code


This page contains the most typical mistakes I see in the Java code of people working with me. Static analysis (we're using Qulice) can't catch all of these mistakes for obvious reasons, and that's why I decided to list them all here.

Let me know if you want to see something else added here, and I'll be happy to oblige.

All of the listed mistakes are related to object-oriented programming in general and to Java in particular.

Class Names

Your class should be an abstraction of a real-life entity with no "validators," "controllers," "managers," etc. If your class name ends with "-er"---it's a bad design. BTW, here are my seven virtues of a good object. Also, this post explains the idea in more detail: Don't Create Objects That End With -ER.

And, of course, utility classes are anti-patterns, like StringUtils, FileUtils, and IOUtils from Apache. They are perfect examples of terrible design. Read this follow-up post: OOP Alternative to Utility Classes.
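To illustrate the difference (a hypothetical sketch, not code from the linked post): instead of calling a static helper, wrap the data in a small object with a single responsibility:

```java
// Hypothetical example: an object replaces a static utility call
// such as SomeUtils.trim(text).
final class TrimmedText {
  private final String origin;
  TrimmedText(String origin) {
    this.origin = origin;
  }
  String value() {
    return this.origin.trim();
  }
  public static void main(String[] args) {
    System.out.println(new TrimmedText("  hello  ").value()); // prints "hello"
  }
}
```

The object can be passed around, composed, and decorated, which a static method cannot.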

Of course, never add suffixes or prefixes to distinguish between interfaces and classes. For example, all of these names are terribly wrong: IRecord, IfaceEmployee, or RecordInterface. Usually, an interface name is the name of a real-life entity, while a class name should explain its implementation details. If there is nothing specific to say about an implementation, name it Default, Simple, or something similar. For example:

class SimpleUser implements User {}
class DefaultRecord implements Record {}
class Suffixed implements Name {}
class Validated implements Content {}

Method Names

Methods can either return something or return void. If a method returns something, then its name should explain what it returns, for example (don't use the get prefix ever):

boolean isValid(String name);
String content();
int ageOf(File file);

If it returns void, then its name should explain what it does. For example:

void save(File file);
void process(Work work);
void append(File file, String line);

You can read more about this idea in the Elegant Objects book, section 2.4. There is only one exception to the rule just mentioned---test methods for JUnit. They are explained below.

Test Method Names

Method names in JUnit tests should be created as English sentences without spaces. It's easier to explain by example:

/**
 * HttpRequest can return its content in Unicode.
 * @throws Exception If test fails
 */
@Test
public void returnsItsContentInUnicode() throws Exception {
}

It's important to start the first sentence of your Javadoc with the name of the class you're testing followed by can (or cannot). So, your first sentence should always be similar to "somebody can do something."

The method name states exactly the same, but without the subject. If I add a subject at the beginning of the method name, I should get a complete English sentence, as in the above example: "HttpRequest returns its content in Unicode."

Pay attention that the test method doesn't start with can. Only Javadoc comments use 'can.'

It's a good practice to always declare test methods as throwing Exception.

Variable Names

Avoid composite variable names, like timeOfDay, firstItem, or httpRequest. This applies to both class variables and in-method ones. A variable name should be long enough to avoid ambiguity in its scope of visibility, but not too long if possible. A name should be a noun in singular or plural form, or an appropriate abbreviation. More about this in this post: A Compound Name Is a Code Smell. For example:

List<String> names;
void sendThroughProxy(File file, Protocol proto);
private File content;
public HttpRequest request;

Sometimes, you may have collisions between constructor parameters and in-class properties if the constructor saves incoming data in an instantiated object. In this case, I recommend creating abbreviations by removing vowels (see how USPS abbreviates street names).

Another example:

public class Message {
  private String recipient;
  public Message(String rcpt) {
    this.recipient = rcpt;
  }
}

In many cases, the best hint for the name of a variable can be ascertained by reading its class name. Just write it with a lowercase first letter, and you should be good:

File file;
User user;
Branch branch;

However, never do the same for primitive types, like Integer number or String string.

You can also use an adjective when there are multiple variables with different characteristics. For instance:

String concat(String left, String right);

Constructors

Without exception, there should be only one constructor that stores data in object variables. All other constructors should call this one with different arguments. For example:

public class Server {
  private String address;
  public Server(String uri) {
    this.address = uri;
  }
  public Server(URI uri) {
    this(uri.toString());
  }
}

More about this in There Can Be Only One Primary Constructor.

One-time Variables

Avoid one-time variables at all costs. By "one-time" I mean variables that are used only once. Like in this example:

String name = "data.txt";
return new File(name);

The above variable is used only once, and the code should therefore be refactored to:

return new File("data.txt");

Sometimes, in very rare cases---mostly because of better formatting---one-time variables may be used. Nevertheless, try to avoid such situations at all costs.

Exceptions

Needless to say, you should never swallow exceptions, but rather let them bubble up as high as possible. Private methods should always let checked exceptions go out.

Never use exceptions for flow control. For example, this code is wrong:

int size;
try {
  size = this.fileSize();
} catch (IOException ex) {
  size = 0;
}

Seriously, what if that IOException says "disk is full?" Will you still assume that the size of the file is zero and move on?
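A cleaner alternative---a sketch, with fileSize() stubbed so the example is self-contained---is to declare the checked exception and let the caller decide what a full disk means:

```java
import java.io.IOException;

class Report {
  // Let the checked exception bubble up to the caller
  // instead of silently substituting a fake value.
  int size() throws IOException {
    return this.fileSize();
  }

  private int fileSize() throws IOException {
    return 100; // stub; a real implementation would touch the disk
  }

  public static void main(String[] args) throws IOException {
    System.out.println(new Report().size()); // prints "100"
  }
}
```

Now an I/O failure reaches the layer that actually knows how to react to it.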

Indentation

For indentation, the main rule is that a bracket should either end a line or be closed on the same line (the reverse rule applies to a closing bracket). For example, the following is incorrect because the first bracket is not closed on the same line and there are symbols after it. The second bracket is also problematic because there are symbols in front of it and it is not opened on the same line:

final File file = new File(directory,
  "file.txt");

Correct indentation should look like:

StringUtils.join(
  Arrays.asList(
    "first line",
    "second line",
    StringUtils.join(
      Arrays.asList("a", "b")
    )
  ),
  "separator"
);

The second important rule of indentation says that you should put as much as possible on one line---within the limit of 80 characters. The example above is not valid since it can be compacted:

StringUtils.join(
  Arrays.asList(
    "first line", "second line",
    StringUtils.join(Arrays.asList("a", "b"))
  ),
  "separator"
);

Redundant Constants

Class constants should be used when you want to share information between class methods, and this information is a characteristic (!) of your class. Don't use constants as a replacement for string or numeric literals---a very bad practice that leads to code pollution. Constants (as with any object in OOP) should have a meaning in the real world. What meaning do these constants have in the real world:

class Document {
  private static final String D_LETTER = "D"; // bad practice
  private static final String EXTENSION = ".doc"; // good practice
}

Another typical mistake is to use constants in unit tests to avoid duplicate string/numeric literals in test methods. Don't do this! Every test method should work with its own set of input values.

Use new texts and numbers in every new test method. Test methods are independent, so why should they share the same input constants?

Test Data Coupling

This is an example of data coupling in a test method:

User user = new User("Jeff");
// maybe some other code here
MatcherAssert.assertThat(user.name(), Matchers.equalTo("Jeff"));

On the last line, we couple "Jeff" with the same string literal from the first line. If, a few months later, someone wants to change the value on the third line, he/she has to spend extra time finding where else "Jeff" is used in the same method.

To avoid this data coupling, you should introduce a variable. More about it here: A Few Thoughts on Unit Test Scaffolding.
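Here is the same test with the literal extracted into a variable (the User class is stubbed so the sketch is self-contained, and a plain assertion stands in for Hamcrest):

```java
class UserTest {
  // Minimal stand-in for the User class from the article.
  static final class User {
    private final String name;
    User(String name) {
      this.name = name;
    }
    String name() {
      return this.name;
    }
  }

  public static void main(String[] args) {
    // The literal appears only once; the assertion reuses the variable.
    String name = "Jeff";
    User user = new User(name);
    if (!user.name().equals(name)) {
      throw new AssertionError("unexpected name");
    }
    System.out.println("OK"); // prints "OK"
  }
}
```

Changing the test value now means editing a single line.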

Incremental Requirements With Requs


Requirements engineering is one of the most important disciplines in software development. Perhaps even more important than architecture, design, or coding itself.

Joy Beatty and Karl Wiegers, in Software Requirements, argue that a mistake made in a requirements specification costs significantly more than a bug in source code. I totally agree.

In XDSD projects, we specify requirements using Requs, a controlled natural language that reads like English while at the same time being parseable by computers. A simple requirements document in Requs may look like this:

Department has employee-s.
Employee has name and salary.
UC1 where Employee gets raise: "TBD."

This Software Requirements Specification (SRS) defines two types (Department and Employee) and one use case, UC1.

Requs syntax is explained here.

The main and only goal of requirements engineering in any XDSD project is to create a complete and non-ambiguous SRS document. The person who performs this task is called the "system analyst." This article explains his or her main tasks and discusses possible pitfalls.

Tasks

We modify SRS incrementally, and our increments are very small. For instance, say we have the sample document I mentioned above, and I'm a system analyst on the project. All my tasks will be similar to "there is a bug in SRS, let's fix it."

Even if it is a suggestion, it will still start with a complaint about the incompleteness of the SRS. For example:

  • UC1 doesn't explain how exactly an employee receives a raise.
  • Does the salary of an employee have limits? Can it be negative?
  • How many employees can a department have? Can it be zero?
  • Can an employee receive a decrease in salary?

All of these bugs are addressed to me. I need to fix them by improving the SRS. My workflow is the same in every task:

  1. Understand what is required
  2. Change the SRS
  3. Close the task

Let's try this step by step.

Requirements Providers

As a system analyst, my job is to understand what product owners (aka "requirements providers") want and document their wishes. In most cases, their wants and wishes are very vague and chaotic. My job is to make them complete and unambiguous. That's why the first step is to understand what is required.

First of all, I must determine who the product owner is before I can begin. The product owner signs off on the SRS, so I should pay close attention to his opinions. However, my job is not only to listen, but also to suggest. A good system analyst can provoke creative thinking in a product owner by asking the right questions.

OK, now that I know who the product owner is, I need to talk to him. In XDSD, we don't do meetings, phone calls, or any other type of informal communication. Therefore, my only mechanism for receiving the information I need is tickets.

I will submit new tickets, addressing them to the product owner. As there can be many product owners in a project, I must submit tickets that clearly state in the first sentence that the ticket contains questions for a particular owner. The person receiving the ticket will then determine the best person to answer it.

Thus, while working with a single task, I will submit many questions and receive many interesting answers. I'll do all this in order to improve my understanding of the product the owners are developing.

When I understand how the SRS should be fixed, it is time to make changes in the Requs files.

Requs Files

The SRS document is generated automatically on every continuous integration build cycle. It is compiled from pieces called .req files, which are usually located in the src/main/requs directory in a project repository.

My job, as a system analyst, is to make changes to some of these files and submit a pull request for review.

The GitHub Guidelines explain how to work with GitHub. In short, I need to:

  • Clone the repository;
  • Check out its copy to my computer;
  • Make changes;
  • Commit my changes;
  • Push them to my remote fork;
  • Submit a pull request

It doesn't really matter which files I edit because Requs automatically composes all files with the .req extension together. I can even add new files to the directory---they will be picked up. Likewise, I can also add subdirectories with files.

Local Build

Before submitting a pull request, I will try to validate that my changes are syntactically and grammatically valid. I will compile Requs files into the SRS document using the same method our continuous integration server uses to compile them.

Before I can compile, though, I need to install JDK7 and Maven.

Afterwards, I make the following command line call in the project directory:

mvn clean requs:compile

After entering the command, I expect to see the BUILD SUCCESS message. If not, there are errors and I should fix them. My pull request won't be merged, and I won't be able to close the task, if Requs can't compile the files.

Once compiled, I can open the SRS in Firefox. It is in target/requs/index.xml. Even though it is an XML file, Firefox can open it as a webpage. Other browsers won't work. Well, Google Chrome will work, but only with this small trick.

Pull Request Review

Once all changes are finished, I will submit a pull request. A project manager will then assign someone to review my pull request, and I will receive feedback.

In most cases, there will be at least a few corrections requested by the reviewer. Generally speaking, my requests are reviewed by other system analysts. Therefore, I must address all comments and make sure my changes satisfy the reviewer.

I will make extra changes to the same branch locally, and push them to GitHub. The pull request will be updated automatically, so I don't need to create a new one.

Once the pull request is clean enough for the reviewer, he will merge it into the master branch.

Close and Get Paid

Finally, my pull request is merged and I get back to the task owner. I tell him that the SRS was fixed and request that he review it. His original problem should be fixed by now---the SRS should provide the information required.

He then closes the task and the project manager pays me within a few hours.

PS. Also, check this article about a custom lexer for Jekyll, which I created in order to highlight Requs syntax in this blog post.

Java XML Parsing Made Easy


Unlike in many other modern languages, parsing XML in Java requires more than one line of code. XML traversal using XPath takes even more code, and I find this unfair and annoying.

I'm a big fan of XML and use it in almost every Java application. Some time ago, I decided to put all of that XML-to-DOM parsing code into a small library---jcabi-xml.

Put simply, the library is a convenient wrapper for JDK-native DOM manipulations. That's why it is small and dependency-free. With the following example, you can see just how simple XML parsing can be:

import com.jcabi.xml.XML;
import com.jcabi.xml.XMLDocument;
XML xml = new XMLDocument(
  "<root><a>hello</a><b>world!</b></root>"
);

Now we have an object of the XML interface that can traverse the XML tree and convert it back to text.

For example:

// outputs "hello"
System.out.println(xml.xpath("/root/a/text()").get(0));
// outputs the entire XML document
System.out.println(xml.toString());

The xpath() method allows you to find a collection of text nodes or attributes in the document and convert them to a collection of strings, using an XPath query:

// outputs "hello" and "world!"
for (String text : xml.xpath("/root/*/text()")) {
  System.out.println(text);
}

The nodes() method performs the same XPath search operation, but instead returns a collection of instances of the XML interface:

// outputs "<a>hello</a>" and "<b>world!</b>"
for (XML node : xml.nodes("/root/*")) {
  System.out.println(node);
}

Besides XML parsing, printing and XPath traversing, jcabi-xml also provides XSD validation and XSL transformations. I'll write about those features in the next post :)

PS. Also, check this: XML/XPath Matchers for Hamcrest.

Basic HTTP Auth for S3 Buckets


Amazon S3 is a simple and very useful storage of binary objects (aka "files"). To use it, you create a "bucket" there with a unique name and upload your objects.

Afterwards, AWS guarantees your object will be available for download through their RESTful API.

A few years ago, AWS introduced an S3 feature called static website hosting.

With static website hosting, you simply turn on the feature and all objects in your bucket become available through public HTTP. This is an awesome feature for hosting static content, such as images, JavaScript files, video and audio content.

When using the hosting feature, you change the CNAME record in your DNS so that it points to your bucket's S3 website endpoint at amazonaws.com. After changing the DNS entry, your static website is available at www.example.com just as it would be normally.
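A hypothetical DNS zone entry for such a setup might look like this (the bucket name and region below are illustrative assumptions, not from the original post):

```text
; point the subdomain at the bucket's S3 website endpoint
www.example.com.  300  IN  CNAME  www.example.com.s3-website-us-east-1.amazonaws.com.
```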

When using Amazon S3, though, it is not possible to password-protect your website because the content is purely static. This means you can't have a login page on the front end. With the service, you can either make your objects absolutely public---so that anyone can see them online---or assign access rights to them---but only for users connected through the RESTful API.

My use case with the service was a bit more complex, though. I wanted to host my static content as S3 objects. However, I wanted to do this while ensuring only a few people had access to the content using their Web browsers.

HTTP Basic Authentication

The HTTP protocol offers a nice "basic access authentication" feature that doesn't require any extra site pages.

When an HTTP request arrives without credentials, the server doesn't deliver the content but replies with a 401 status response. This response literally means "I don't know who you are; please authenticate yourself."

The browser shows its native login screen and prompts for a user name and password. The browser then concatenates the entered credentials, encodes them in Base64, and adds them to the next request in the Authorization HTTP header.

Now, the browser tries to make another attempt to fetch the same webpage. But, this time, the HTTP request contains a header:

Authorization: Basic am9lOnNlY3JldA==

The above is just an example. Its Base64-encoded part decodes to joe:secret, where joe is the user name and secret is the password entered by the user.

This time, the server has authentication information and can decide whether this user is authenticated (his password matches the server's records) and authorized (he has permission to access the requested webpage).
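You can reproduce the header value with a few lines of Java (using java.util.Base64, which is available since Java 8---the original post predates it, so this is purely illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuth {
  public static void main(String[] args) {
    // Encode "user:password" exactly as a browser does for Basic auth.
    String token = Base64.getEncoder().encodeToString(
      "joe:secret".getBytes(StandardCharsets.UTF_8)
    );
    System.out.println("Authorization: Basic " + token);
    // prints "Authorization: Basic am9lOnNlY3JldA=="
  }
}
```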

s3auth.com

Since Amazon doesn't provide this feature, I decided to create a simple web service, s3auth.com, which stays in front of my Amazon S3 buckets and implements the HTTP-native authentication and authorization mechanism.

Instead of making my objects public, though, I make them private and point my CNAME record to relay.s3auth.com. HTTP requests from Web browsers then arrive at my server, which connects to Amazon S3, retrieves my objects, and delivers them back in HTTP responses.

The server implements authentication and authorization using a special file .htpasswd in the root of my bucket. The format of the .htpasswd file is identical to the one used by Apache HTTP Server---one user per line. Every line has the name of a user and a hash version of his password.
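A minimal .htpasswd file might look like this (the user name and hash below are illustrative assumptions; the hash shown is the well-known Apache SHA-1 digest of the word "password"):

```text
# one user per line: name, colon, hash of the password
joe:{SHA}W6ph5Mm5Pz8GgiULbPgzG37mj9g=
```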

Implementation

I made this software open source mostly to guarantee to my users that the server doesn't store their private data anywhere, but rather acts only as a pass-through service. As a result, the software is on GitHub.

For the sake of privacy and convenience, I use only OAuth2 for user accounts. This means that I don't know who my users are. I don't possess their names or emails, but only their account numbers in Facebook, Google Plus or GitHub. Of course, I can find their names using these numbers, but this information is public anyway.

The server is implemented in Java 6. For its hosting, I'm using a single Amazon EC2 m1.small Ubuntu server. These days, the server works properly and is stable.

Extra Features

Besides authentication and authorization, the s3auth.com server can render listing pages---just like the Apache HTTP Server. If you have a collection of objects in your bucket but the index.html file is missing, Amazon S3 returns a "page not found" result. My server, conversely, displays a list of objects in the bucket when no index.html is present, and makes it possible to navigate up or down one folder.

When your bucket has the versioning feature turned on, you can list all versions of any object in the browser. To do this, just add ?all-versions to the end of the URL. Then click a version to have s3auth.com retrieve and render it.

Traction

I created this service mostly for myself, but apparently I'm not the only one with the problems described above. At the moment, s3auth.com hosts over 300 domains and passes through more than 10Mb of data each hour.

PS. This post explains how s3auth.com can be used as a front-end to your Maven repository: How to Set Up a Private Maven Repository in Amazon S3.

How Hourly Rate Is Calculated


In XDSD, everyone---including project managers, analysts, programmers, and product owners---receives payments based on deliverables with agreed upon budgets. In the first section of the article How XDSD Is Different, I explain exactly how this concept works. I don't explain in the article, though, how we decide which hourly rate is acceptable for each project participant.

When new people come to us, they usually have some numbers in mind. They know how much they expect to make per week, per month or per day. We rarely negotiate the payment rates, but rather just accept reasonable offers (see How Much Do You Cost?). Nonetheless, every few months, we review payment rates and change them accordingly (increasing or decreasing them as appropriate).

Further along in the article is a list of factors that influence our decision-making process regarding payment rates. However, before we get to the factors that influence our rate-setting decisions, it is important to mention that---unlike most other companies or software teams---we don't pay attention to the following:

  • Your geographic location;
  • Skills and experience listed in your CV;
  • Amount of time already spent on our projects;
  • Age, sex, nationality, religious beliefs, etc.

The factors listed below, though, are indeed very important to us. They affect your "overall score" significantly and play a major part in decisions to decrease or increase a payment rate. After changing a payment rate, we don't negotiate it with the project member.

Keep in mind that besides decreasing your hourly rate, a low overall score may affect the number of tasks you receive from us.

The best developers receive most of the new tasks. So, continue reading, follow our principles and learn how to earn and enjoy higher rates :)

Fast Delivery

The faster you deliver on a task, the better. We track all your completed tasks and can easily calculate how many days, on average, it takes you to close a task. To improve this metric, close all your tasks as soon as possible and keep your overall completion-time average down.

If you see that a specific task is not suitable for you, don't hold on to it. Instead, inform your project manager as soon as possible that you do not want to work on the task. The project manager will then try to find you something more suitable.

By the way, the best developers usually close their tasks in five calendar days (or less) on average.

Past Due Tasks

Though we encourage everyone to reject tasks they don't like, we are strongly against overdue tasks. Once you have started to work on a task, we expect you to finish it on time.

Our No Obligation principle gives our project managers the freedom to take any task away from you if you don't complete it in a reasonable amount of time (ten days).

Removal of tasks by project managers affects your overall score negatively. Nevertheless, even the best developers sometimes have overdue tasks, and we understand that it happens from time to time. However, our best developers keep their number of overdue tasks to a minimum. A good rule of thumb is about one overdue task per twenty completed successfully and on time.

Complexity

Every XDSD task has a project role assigned to it. The article, Puzzle Driven Development by Roles, lists the key roles we use in XDSD projects. Generally speaking, the higher the role, the higher the complexity of tasks assigned to it. Therefore, closing a task in an "architect" role is much more important than closing one as an "implementer" (or "developer.")

The more tasks you close in your current role, the faster you will receive promotions and receive pay-rate increases. Very often, our developers work in a few roles at the same time.

Lengthy Discussions

We discourage long conversations on one task. The longer the discussion about a task, the longer it takes to complete---which lowers your overall score. Ideally, developers should receive a task, deliver the result, and inform the task author once it's done. Afterwards, the task author closes the task and payment is made.

We track the number of messages you post and receive in your tasks automatically. Consequently, too many messages may affect your overall score in a negative way.

To avoid long conversations in tasks, submit new tickets with questions or bug reports. Again, the Puzzle Driven Development by Roles article explains the whole idea of helping us "to break the project" by submitting new bugs. Follow this concept and you'll be fine.

Contribution via Bugs

In XDSD, Bugs Are Welcome. You are expected to report bugs alongside your normal development activities. Besides receiving extra money for reporting bugs, you can also increase your overall rating.

The best developers submit one bug for every 2 to 3 tasks they complete.

Mocking of HTTP Server in Java


Recently, I presented a fluent Java HTTP client created (mostly) to make HTTP interactions more object-oriented than they are with other available clients, including the Apache Client, Jersey Client, and plain old HttpURLConnection.

This client ships in the jcabi-http Maven artifact. However, the client is not the only benefit of jcabi-http. The library also includes a server component that can help you in unit and integration testing of your HTTP clients.

Let me show you an example first. In the example, I'm using hamcrest for assertions.

MkContainer container = new MkGrizzlyContainer()
  .next(new MkAnswer.Simple("hello, world!"))
  .start();
try {
  new JdkRequest(container.home())
    .header("User-agent", "Myself")
    .fetch()
    .assertBody(Matchers.containsString("hello"));
} finally {
  container.stop();
}
MkQuery query = container.take();
MatcherAssert.assertThat(
  query.headers().get("User-agent"),
  Matchers.hasItem("Myself")
);

Now, let's discover what happens here.

In the first few lines, I create an instance of MkContainer, which literally has four methods: next(MkAnswer), start(), stop(), and home().

It works as an HTTP server with a "first-in-first-out" queue for HTTP answers. We add answers, and the server returns them in response to HTTP requests.

The server starts on the start() call and stops on stop(). Its home() method returns the URL of its "home page." The server binds itself to a randomly allocated TCP port---the first available and unoccupied one it finds.

In the example above, I added just one answer. This means the container will reply with that answer only to the first HTTP request; all subsequent requests will receive HTTP responses with status 500 "Internal Server Error."
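The first-in-first-out idea can be sketched with the JDK's built-in HTTP server---an illustration of the concept only, not jcabi's actual implementation:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Scanner;

class FifoMock {
  public static void main(String[] args) throws Exception {
    // A queue of canned answers, served one per request.
    Deque<String> answers = new ArrayDeque<>();
    answers.add("hello, world!");
    // Port 0 lets the OS pick a free port, as MkContainer does.
    HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
    server.createContext("/", exchange -> {
      String body = answers.isEmpty() ? "" : answers.poll();
      int status = body.isEmpty() ? 500 : 200;
      byte[] bytes = body.getBytes("UTF-8");
      exchange.sendResponseHeaders(status, bytes.length == 0 ? -1 : bytes.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(bytes);
      }
    });
    server.start();
    URL home = new URL(
      "http://localhost:" + server.getAddress().getPort() + "/"
    );
    HttpURLConnection conn = (HttpURLConnection) home.openConnection();
    try (Scanner scanner = new Scanner(conn.getInputStream(), "UTF-8")) {
      System.out.println(scanner.nextLine()); // prints "hello, world!"
    }
    server.stop(0);
  }
}
```

Once the queue runs dry, the sketch answers 500, mirroring the behavior described above.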

Next, I make an HTTP request to the already started server and assert that the body of the HTTP response contains the text "hello". Obviously, this assertion will pass, because the server returns "hello, world!" to my first request:

new JdkRequest(container.home())
  .header("User-agent", "Myself")
  .fetch()
  .assertBody(Matchers.containsString("hello"));

As you can see, I use container.home() in order to get the URL of the server. It is recommended that you allow the container to find the first unoccupied TCP port and bind itself to it. Nevertheless, if you need to specify your own port, you can do it with a one-argument method start(int) in MkContainer.

I use try/finally to stop the container safely. In unit tests, this is not critical, as you can simplify your code and never stop the container. Besides, the container will be killed together with the JVM. However, for the sake of clarity, I would recommend you stop the container in the finally block.

Then, I ask the stopped container to give me the first request it received. This mechanism is conceptually similar to the "verify" feature of mocking frameworks such as Mockito:

MkQuery query = container.take();
MatcherAssert.assertThat(
  query.headers().get("User-agent"),
  Matchers.hasItem("Myself")
);

An instance of MkQuery exposes information about the query made. In this example, I get all the headers of the HTTP request and make an assertion that the "User-Agent" header was present and had at least one value equal to "Myself".

This mocking technology is used actively in the unit and integration tests of jcabi-github, a Java client for the GitHub API. There, it is very important for checking which requests are sent to the server and validating that they comply with our requirements. This is exactly the jcabi-http mocking shown above.

As with the client, you need the jcabi-http.jar dependency (get its latest version from Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-http</artifactId>
</dependency>

Besides the above, you need to add one more dependency: the Grizzly HTTP server, on which MkGrizzlyContainer is based.

<dependency>
  <groupId>com.sun.grizzly</groupId>
  <artifactId>grizzly-servlet-webserver</artifactId>
  <scope>test</scope>
</dependency>

If you have any questions or suggestions, please submit them through GitHub issues. As always, bugs are welcome :)

© Yegor Bugayenko 2014–2018

How XDSD Is Different


eXtremely Distributed Software Development, or XDSD for short, is a methodology that differs significantly from working in traditional software development teams. Most XDSD methods are so different (yet critical) that many newcomers get confused. This article should help you bootstrap once you join a project managed by XDSD principles---either as a developer or a project sponsor.

Revolver (2005) by Guy Ritchie

We Pay Only For Closed Tasks

Unlike with many other projects, in XDSD we pay only for closed tasks, within an agreed-upon time budget. Let me explain by example. Let's say you are a Ruby programmer and you get a new task that requires you to fix a broken unit test. The task has a time budget of 30 minutes, as is the case most of the time. Sometimes, though, tasks may have time budgets of fifteen minutes or one hour.

In our example, we agree upon a contract rate of $50 per hour. With the broken test, you will receive $25 for completing the task---a 30-minute task billed at $50 per hour.

It does not matter how long it actually takes you to fix the test. Your actual time spent on the project may be five minutes or five hours. Nevertheless, you will receive compensation for 30 minutes of work only. If you fix the broken test in 5 minutes, you receive $25. If the task takes you an hour, or even a month, to complete, you still receive only $25.
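The rule above reduces to simple arithmetic. A minimal sketch, with illustrative names, assuming a fixed budget in minutes and payment only on closure:

```java
// Payout depends only on the task's budget, the agreed hourly rate,
// and whether the task was closed; actual time spent never enters
// the formula.
final class Payment {
    static double payout(final int budgetMinutes, final double hourlyRate,
        final boolean closed) {
        // An open (unclosed) task pays nothing, regardless of effort.
        return closed ? budgetMinutes / 60.0 * hourlyRate : 0.0;
    }
}
```

For the example in the text: a closed 30-minute task at $50 per hour pays $25; the same task left unclosed pays $0.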

Furthermore, if you fail to fix the unit test and close the task altogether, you will receive no pay at all for the assignment.

You can view more details about this principle in the following articles: No Obligations Principle or Definition of Done.

As mentioned above, this is one of the most important differences between XDSD and other methods. Many people get confused when they see this principle in action, and some leave our projects because of it. They simply are used to being paid by the end of the month---no matter how much work they actually deliver. In XDSD, we consider this type of approach very unfair. We feel that people who deliver more results should receive more cash. Conversely, those who don't deliver should get less.

We Deliver Unfinished Components

Since most of our tasks are half an hour in size, we encourage developers to deliver unfinished components. Read more about this concept in the article below: Puzzle Driven Development.

No Informal Communications

Unlike many other projects or teams you may have worked with, XDSD uses no informal communication channels. To clarify: we never use email, we never chat on Skype, and we don't do meetings or phone calls. Additionally, XDSD maintains no mailing lists. Our only method of communication is a ticket tracking system (which in most projects means GitHub Issues.)

Moreover, we discourage horizontal communications between developers regarding the scope of individual tasks. When assigned a task, your single and only point of contact (and your only customer) is the task author. You communicate with the author in the ticket to clarify task requirements.

When the requirements of a task are clear---and you understand them fully---deliver the result to the author and wait for him to close the task. After the author closes the task, the project manager pays you.

Goodfellas (1990) by Martin Scorsese

We're very strict about this principle---no informal communications. However, it doesn't mean that we are not interested in your opinions and constructive criticism. Rather, we encourage everyone to submit their suggestions and bugs. By the way, we pay for bugs (see the next section for further details about bug reporting and payments.)

Since we have no formal communications, members of project teams are not required to work at specific times. Instead, team members work at times convenient for them in their time zones. This includes weekdays and weekends.

We Pay For Bugs

Unlike many other software teams, XDSD welcomes bug reports in all our projects. Therefore, we ask for bugs openly and expect team members to report them. Review the following article for complete details on XDSD bug reporting: Bugs are welcome

We expect everyone involved with a project to report every bug found. Additionally, we encourage team members to make suggestions. In XDSD, we pay team members for every properly reported bug.

XDSD makes payments for reported bugs because we believe that the more of them we can find, the higher the quality of the end product. Some new developers are surprised when they receive tasks such as "you must find 10 bugs in class A." Often, the natural reaction is to ask "what if there are no bugs?" However, we believe that any software product may have an unlimited amount of bugs; it is just a matter of expending the time and effort needed to discover them.

Only Pull Request

We never grant team members access to the master branch---no matter how long you work on a project. Consequently, you must always submit your changes through pull requests (most of our projects are done in GitHub.)

We enforce this policy not because we don't trust our developers, but simply because we don't trust anyone :) Read this article: Master Branch Must Be Read-Only.

No Compromises About Code Quality

Before merging any changes into the master branch, we check the entire code base with unit tests and static analyzers. Unit testing is a very common component in modern software development, and one by which you should not be surprised. However, the strictness of static analysis is something that often frustrates XDSD newcomers, and we understand that. We pay much more attention to the quality and uniformity of our source code than most competing software development teams.

Even more important is that we never make compromises. If your pull request violates even one rule of the static analyzer, it won't be accepted. And, it doesn't matter how small or innocent that violation may look. This merging process is fully automated and can't be bypassed.


GitHub Guidelines


This manual explains the workflow used when working with an XDSD project hosted on GitHub. You start when a GitHub issue is assigned to you. Next, you will receive a message from a project manager containing the issue number, title, description and its budget in hours (usually 30 minutes).

If you don't agree with the budget allotment, don't hesitate to ask for an increase. As soon as you are comfortable with the budget and understand the scope of the work, say so in a reply to the ticket and start working. Be aware that you won't be paid for time spent above and beyond the allotted time budget.

1. Fork

Even though you're part of the development team, you don't have write access to the repository in GitHub. Consequently, to contribute changes, you should fork the repository to your own GitHub account (create a private copy of it), make needed changes and then submit them for review using "a pull request."

After you submit a pull request for review, the repository owner approves your changes by merging them into the main repository. This is how we protect the main development stream against accidental damage.

This article explains how to fork a repository: fork-a-repo

This one explains how to download and install Git on your computer: set-up-git

Finally, don't forget to add your public SSH key to GitHub: generating-ssh-keys

2. Branch

Once you have forked our repository to your account, clone it to your computer and check out the master branch. For example:

git clone git@github.com:yegor256/xembly.git
git checkout master

Now, it's time to branch (123 is the number of the GitHub issue you're going to work with, and the name of the branch):

git checkout -b 123

By convention, we use the same names for the branch and issue you're working with.

3. Changes

All task-related questions should be discussed directly in GitHub issues; we don't use emails, Skype, phone calls or meetings.

Don't hesitate to submit new issues if something is not clear or you need help. It's very common to receive a task that you may not be able to implement. Don't panic. This usually happens when you first join a project and don't yet have enough information. If this happens, don't try to figure out the problem by yourself.

The rule of thumb for this type of situation is: "If something is not clear, it is our fault, not yours." Therefore, if you don’t understand the project design, it is the fault of the project designer.

Submit a bug report requesting an explanation of a design concept. You will be paid for this report, and the information you receive in the reply will be shared between all other developers.

Read this article: Bugs Are Welcome.

Don't expect anyone to help you. Your only source of help is the source code itself. If the code doesn't explain everything you need to know---it is a bug, which must be reported.

4. Commit and Push

Make any needed changes using a text editor or IDE. It's a good practice to commit changes as soon as you make them. Don't accumulate large numbers of changes too long before committing them.

git commit -am '#123: the description of the changes'
git push origin 123

If you have questions about the scope of work, post them in the GitHub issue and wait for an answer. If you think that the existing code needs improvements, don't hesitate to submit a new issue to GitHub.

Don't try to fix all problems in one branch; let other programmers take care of them.

5. Pull Request

Create a pull request in GitHub using the process in the following article: using-pull-requests

Post its number in the original issue and wait for feedback.

6. Code Review

After a while, your pull request will be reviewed by someone from the project team. In many cases, you may receive a few negative comments, and you will have to fix all issues associated with them. Your pull request won't be merged into the master branch until your changes satisfy the reviewer.

Be patient with the reviewer, and listen to him carefully. However, don't think that your reviewer is always right. If you think that your changes are valid, insist that someone else review them.

7. Merge

When everything looks good to the reviewer, he will inform our automated merge bot, which will then pick up your pull request and try to merge it into the master branch. For various reasons, this operation often fails. If the merge fails, regardless of the reason, it is your responsibility to make sure that your branch is eventually merged successfully.

If you can't merge a branch because of failures in tests not associated with your task, don't try to fix them yourself. Instead, report a problem as a new bug and wait for its resolution.

Remember, until your branch is merged, you are not paid.

8. Payment

Once your changes are merged, return to the GitHub issue and ask the author to close it. Once the issue is closed by a project manager, you will receive your payment within a few hours.


Definition Of Done


Definition of Done (DoD) is a key definition used in Scrum and one we also use in XDSD. DoD is the exit criterion of a simple atomic task and answers the question: "am I done with this task?" Moreover, DoD answers the question: "will I be paid for the task?" In XDSD, the definition of "done" is very simple---the task is done iff its author accepts the deliverables.

At XDSD, our first and most important principle states that someone is paid only when they provide deliverables. Combining the definition of done with the principle of paying only for deliverables leads us to a very important conclusion: we do not pay for unfinished tasks.

Every task has its own time budget. Regardless of the number of people who worked on a task previously, only the last one---the one who managed to provide a working deliverable---receives payment.

To better understand this principle, you should read: No Obligations Principle.

Your goal as a developer working on a task should be to close it and receive payment as soon as possible. To that end, here are a few things that can help you complete tasks and receive payments without too much frustration:

  • Don't even start a task unless you're sure you can finish it;
  • Ask any and all questions of the task author in advance (before beginning work);
  • Don't assume anything---ask if you're not sure;
  • Stay after the author to close tasks---be aggressive, no matter who he or she is;
  • Don't expect any help from anyone---you're on your own;
  • Ask the PM about payment if you don't receive it automatically after the author closes your task(s)

It is important to remember that, as a developer, it is your responsibility to ensure that tasks are closed and you receive payment.


Object-Oriented DynamoDB API


I'm a big fan of cloud computing in general and of Amazon Web Services in particular. I honestly believe that in a few years big providers will host all, or almost all, computing and storage resources. When this is the case, we won't have to worry too much anymore about downtime, backups and system administrators. DynamoDB is one of the steps towards this future.

DynamoDB is a NoSQL database accessible through RESTful JSON API. Its design is relatively simple. There are tables, which basically are collections of data structures, or in AWS terminology, "items."

Every item has a mandatory "hash," an optional "range" and a number of other optional attributes. For instance, take the example table depts, where dept is the hash and worker is the range:

+------+--------+---------------------------+
| dept | worker | Attributes                |
+------+--------+---------------------------+
| 205  | Jeff   | job="manager", sex="male" |
| 205  | Bob    | age=43, city="Chicago"    |
| 398  | Alice  | age=27, job="architect"   |
+------+--------+---------------------------+

For Java, Amazon provides an SDK, which mirrors all RESTful calls to Java methods. The SDK works fine, but is designed in a purely procedural style.

Let's say we want to add a new item to the table above. RESTful call putItem looks like (in essence):

putItem:
  tableName: depts
  item:
    dept: 435
    worker: "William"
    job: "programmer"

This is what the Amazon server needs to know in order to create a new item in the table. This is how you're supposed to make this call through the AWS Java SDK:

PutItemRequest request = new PutItemRequest();
request.setTableName("depts");
Map<String, AttributeValue> attributes = new HashMap<>();
attributes.put("dept", new AttributeValue().withN("435"));
attributes.put("worker", new AttributeValue("William"));
attributes.put("job", new AttributeValue("programmer"));
request.setItem(attributes);
AmazonDynamoDB aws = // instantiate it with credentials
try {
  aws.putItem(request);
} finally {
  aws.shutdown();
}

The above script works fine, but there is one major drawback---it is not object-oriented. It is a perfect example of imperative procedural programming.

To allow you to compare, let me show what I've done with jcabi-dynamo. Here is my code, which does exactly the same thing, but in an object-oriented way:

Region region = // instantiate it with credentials
Table table = region.table("depts");
Item item = table.put(
  new Attributes()
    .with("dept", 435)
    .with("worker", "William")
    .with("job", "programmer")
);

My code is not only shorter, but it also employs encapsulation and separates the responsibilities of classes. The Table class (actually an interface, internally implemented by a class) encapsulates information about the table, while Item encapsulates item details.

We can pass an item as an argument to another method and all DynamoDB related implementation details will be hidden from it. For example, somewhere later in the code:

void sayHello(Item item) {
  System.out.println("Hello, " + item.get("worker"));
}

In this script, we don't know anything about DynamoDB or how to deal with its RESTful API. We interact solely with an instance of Item class.

By the way, all public entities in jcabi-dynamo are Java interfaces. Thanks to that, you can test and mock the library completely (but I would recommend using DynamoDB Local and creating integration tests).
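Because everything is an interface, a unit test can substitute a fake. The Item interface below is a simplified, hypothetical stand-in for the real jcabi-dynamo one; the point is the technique, not the exact API:

```java
import java.util.HashMap;
import java.util.Map;

// A simplified, hypothetical Item interface; the real jcabi-dynamo
// interface is richer, but the mocking technique is the same.
interface Item {
    String get(String attribute);
}

// A fake implementation for unit tests: no network, no DynamoDB.
final class FakeItem implements Item {
    private final Map<String, String> attrs = new HashMap<>();

    // Fluent setter for test data.
    FakeItem with(final String name, final String value) {
        this.attrs.put(name, value);
        return this;
    }

    @Override
    public String get(final String attribute) {
        return this.attrs.get(attribute);
    }
}
```

A method like sayHello(Item) from the example above can now be unit-tested with new FakeItem().with("worker", "William") instead of a live table.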

Let's consider a more complex example, which would take a page of code if we were to use a bare AWS SDK. Let's say that we want to remove all workers from our table who work as architects:

Region region = // instantiate it with credentials
Iterator<Item> workers = region.table("depts").frame()
  .where("job", Condition.equalTo("architect"))
  .iterator();
while (workers.hasNext()) {
  workers.next();
  workers.remove();
}

jcabi-dynamo has saved a lot of code lines in a few of my projects. You can see it in action at rultor-users.

The library ships as a JAR dependency (get its latest version from Maven Central):

<dependency>
  <groupId>com.jcabi</groupId>
  <artifactId>jcabi-dynamo</artifactId>
</dependency>


No Obligations


It is a very common problem in project management: how do you make team members more responsible while avoiding micromanagement?

We start with creating plans, drawing Gantt charts, announcing milestones, motivating everybody and promising big bonuses on success.

Excuses

Then everybody begins working and we start hearing excuses:

  • "The task is not yet ready. I was doing something else"
  • "May I take a day off? Tomorrow is my birthday"
  • "May I skip the unit test because I don't know how to fix it?"
  • "I don't know how to do it, can someone help me?"
  • "I tried, but this doesn't work; what can I do?"
  • "This installation requires all of my time. I can't finish the task"

With excuses, team members transfer responsibility back to the project manager. There was a very famous article, "Management Time: Who's Got the Monkey?", published in the Harvard Business Review about this very subject.

I recommend that you read it. Its authors present problems as monkeys sitting on our shoulders. When the project manager assigns a task to a programmer---he moves the monkey from his shoulders to the programmer's shoulders.

The programmer usually presents the excuse "I don't know what to do." Now the monkey is back on the shoulders of the manager. The goal of the manager is to send the monkey back and make it the programmer's problem again.

One traditional way of transferring responsibility back to team members is to become an aggressive manager. For instance, the manager may say, "You have a birthday tomorrow? I don't care, you still have to meet your deadline" or "You don't know how to fix the unit test? Not my problem, it should be fixed by tomorrow," etc.

We've all seen multiple examples of that type of aggressive management. Personally, I find this management style extremely annoying and destructive for the project. The project environment becomes very unhealthy and good people usually end up leaving.

Another traditional management method is micro-management. This results when the project manager checks task statuses every few hours and tells people what to do and how to handle problems. Needless to say, this management style ruins the team and causes good people to leave even faster.

However, in order to keep the project on track and meet all milestones, responsibility must be on the shoulders of the team members. They should be responsible for their own tasks and report back to the project manager when they are finished with their jobs.

The Big Lebowski (1998) by Joel Coen

Implementation problems should be solved by team members on their own. So, how do we accomplish this in XDSD?

I Owe You Nothing

In XDSD, the first fundamental principle says that everybody is paid for deliverables. Based on this idea, we can go even further and declare a "No Obligations" principle.

In essence, for every team member, it says: if you don’t like the task assigned to you, don’t have time or you’re simply not in the mood---don't do it.

You have no obligation to do anything. You're free to reject every second task that a project manager gives to you or even all of them.

On the other hand, though, the project manager is not obliged to keep a task assigned to you for longer than 10 days (we think that this time frame is logical).

If you get a task, and don't deliver within ten days, the project manager can take it away and pay you nothing---no matter how much time you invested in the task already or the reasons for your failure to complete it.

Where Are The Monkeys Now?

This principle helps us to separate responsibilities between project manager and team members. The manager is responsible for finding the right people and assigning them appropriate tasks. There is a problem with the project manager's management style if he receives too many rejections from the team.

On the other hand, his team members are responsible for their tasks and should not provide excuses for non-completion. Well, team members can make excuses, but they won't change anything. No matter what their excuses are, the deliverables will be purchased only from members who manage to complete their tasks on time.

How Does This Affect Me?

When you're working on an XDSD-inspired project, you should always keep the "No Obligations" principle in mind. You should start a task only if you're sure that you can finish it in a few days. You should pursue your tasks and control deadlines yourself. The project manager will not ask you for status updates, as usually happens with traditional projects. He will simply take the task away from you after ten days if you don't finish it. To avoid that, you should control your tasks and their deadlines.

With every task, try to be as lazy as possible and cut every corner you can. The smaller the amount of work you perform on a task, the easier it will be to deliver it and pass all quality controls.

Always remember that your efforts are not appreciated---only the deliverables matter.


Bugs Are Welcome


The traditional understanding of a software defect (aka a "bug") is that it is something negative that we want to avoid in our projects. We want our projects to be "bug-free." Our customers ask us to develop software that doesn't have bugs. And we, as users, expect software to work without bugs.

Charlie and the Chocolate Factory (2005) by Tim Burton

But, let's take a look at bugs from a different angle. In XDSD, we say that "bugs are welcome." This means we encourage all interested parties to find bugs and report them. We want our team to see bugs as something that we need in our projects. Why?


Because we understand that there are two categories of bugs: visible and hidden. The more bugs become visible, the more of them we can fix. More fixed bugs mean fewer bugs to annoy our users. By discovering bugs, we make them visible.

This is the primary job of a software tester---to make bugs visible.

Obviously, their visibility affects the quality of the product in a positive way. This is because we can fix them before our users start complaining.

In order to motivate all team members to make more bugs visible, we pay for their discovery. In XDSD projects, we pay for 15 minutes of work for every bug found (no matter who finds it and where.)

We Plan Bugs

We go even further. At XDSD, we plan for a number of hidden bugs in every project. We do this by using our experience with previous projects and expert judgment.

Let's say we're starting to develop a web system, which is similar to the one we worked on last year. We know that in the previous project our users and team together reported 500 bugs.

It's logical to assume that the new project will have a similar number of bugs. Thus, our task is to make those 500 bugs visible before they hit the production platform and our users call us to complain about them. Therefore, we're making it one of the project goals: "discover 500 bugs."

Of course, our estimate may be wrong. Nevertheless, we have historical records for a few dozen projects, and in all of them the number is close to 500. So, finding 500 bugs in a project is usually a reality---we can use it as a target.

What Is a Bug?

Let us try to define a bug (or software defect) in an unambiguous manner. Something can be reported as a bug and subsequently paid for iff:

  • it is reproducible
  • it refers to functionality already implemented
  • it can be fixed in a reasonable amount of time
  • it doesn't duplicate a bug already reported

Reproducibility of a bug is very important. Consequently, it is the responsibility of a bug reporter to make sure the bug is reproducible. Until it is proven that the bug can be reproduced---it's not a bug for which payment can be made.

A bug is not a task; it has to refer to an existing functionality. Additionally, an explanation must exist for how and when the existing functionality doesn't work as expected.


PDD by Roles


In this post, I'll try to walk you through a project managed with the spirit of Puzzle Driven Development (PDD). As I do this, I will attempt to convey typical points of view of various project members.

Basically, there are a few key roles in any software team:

  • Project Manager---assigns tasks and pays on completion
  • System Analyst---documents the product owner's ideas
  • Architect---defines how system components interact
  • Designer---implements most complex components
  • Programmer---implements all components
  • Tester---finds and reports bugs

Everybody, except the project manager, affects the project in two ways: they fix it and they break it at the same time. Let me explain this with a simple example.

Fix and Break

Let's assume, for the sake of simplicity, that a project is a simple software tool written by me for a close friend. I created the first draft version 0.0.1 and delivered it to him. For me, the project is done. I've completed the work, and hopefully will never have to return to it again.

However, the reality of the project is very different. In just a few hours, I receive a call from my friend saying that he's found a few bugs in the tool. He is asking me to fix them. Now I can see that the project is not done. In fact, it's broken. It has a few bugs in it, which means a few tasks to complete.

I'm going to fix the project, by removing the bugs. I implement a new version of the software, name it 0.0.2 and ship it to my friend. Again, I believe my project is finished. It is fixed and should be closed.

This scenario repeats itself again and again until my friend stops calling me. In other words, until he stops breaking my project.

It is obvious that the more my friend breaks my project, the higher the quality of the software ultimately delivered. Version 0.0.1 was just a very preliminary version, although I considered it final at the time I released it. In a few months, after I learn of and fix hundreds of bugs, version 3.5.17 will be much more mature and stable.

This is the result of this "fix and break" approach.

[diagram: project messiness over time]

The diagram shows the relation between time and messiness in the project. The bugs my friend reports break the project, increasing its instability (or simply its messiness). The new versions I release resolve the bugs and fix the project. Your GitHub commit dynamics should resemble this graph, for example:

[chart: GitHub commit activity]

When the project starts, its messiness is rather low, and then it starts to grow. The messiness then reaches its peak and starts to go down.

Project Manager

The job of a project manager is to do as much as possible to fix the project. He has to use the sponsor's time and money in order to remove all bugs and inconsistencies and return the project back to a "fixed" state.

Pulp Fiction (1994) by Quentin Tarantino

When I say "bugs," I mean more than just software errors; I also mean:

  • unclear or ambiguous requirements
  • features not yet implemented
  • functional and non-functional bugs
  • lack of test coverage
  • unresolved @todo markers
  • lack of risk analysis
  • etc.

The project manager gives me tasks that he wants done in order to fix and stabilize the project to return it back to a bug-free state.

My job, as a member of a software team, is to help him perform the needed fixes and, at the same time, do my best to break the project! In the example with my friend, he was breaking the project constantly by reporting bugs to me. This is how he helped both of us increase the final quality of the product.

I should do the same and always try to report new bugs when I'm working on some feature. I should fix and break at the same time.

Now let's take a closer look at project roles.

System Analyst

A product owner submits an informal feature request, which usually starts with "it would be nice to have..." I'm a system analyst, and my job is to translate the owner's English into formal specifications in the SRS, understandable both by programmers and by myself. It's not my responsibility to implement the feature.

Arizona Dream (1992) by Emir Kusturica

My task is complete when a new version of the SRS is signed by the Change Control Board. I'm an interpreter for the product owners, translating from their language to formal language needed in the SRS document. My only customer is the product owner. As soon as she closes the feature request, I'll be paid.

Besides feature requests from product owners, I often receive complaints about the quality of the SRS. The document may not be clear enough for some team members. Therefore, it's my job to resolve clarity problems and fix the SRS. These team members are also my customers. When they close their bug reports, I'll be paid.

In both cases (a feature request or a bug), I can make changes to the SRS immediately---if I have enough time. However, that's not always possible. I could submit a bug and wait for its resolution, but I don't want to keep my customers waiting.

This is where puzzle driven development helps me. Instead of submitting bug reports, I add "TBD" puzzles in the SRS document. The puzzles are informal replacements of normally very strict formal requirements. They satisfy my customer, since they are in plain English, and are understandable by technical people.

Thus, when I don't have time, I don't wait. I change the SRS, inserting TBDs at the points where I can't create a proper, formal description of the requirements or simply don't know exactly what to write.

Architect

Now, I'm the architect, and my task is to implement a requirement that has been formally specified in the SRS. The PM expects a working feature from me, which I can deliver only when the architecture is clear and the classes have been designed and implemented.

The Science of Sleep (2006) by Michel Gondry

Being an architect, I'm responsible for assembling all of the components together and making sure they fit. In most cases, I'm not creating them myself; instead, I'm telling everybody how they should be created. My workflow of artifacts is the following:

PlantUML SVG diagram

I receive requirements from the SRS, produce UML diagrams and explain to designers how to create source code according to my diagrams. I don't really care how source code is implemented. I'm more concerned with the interaction of components and how well the entire architecture satisfies functional and non-functional (!) requirements.

My task will be closed and paid for when the system analyst changes the requirement's status in the SRS from "specified" to "implemented." The system analyst is my only customer, and I have to sell my solution to him; only then will the project manager close my task and pay me.

The task sounds big, and I have only half an hour. Obviously, puzzle driven development should help me. I will create many tickets and puzzles. For example:

  • SRS doesn't explain requirements properly
  • Non-functional requirements are not clear
  • UML diagrams are not clear enough
  • Components are not implemented
  • Build is not automated
  • Continuous integration is not configured
  • Quality of code is not under control
  • Performance testing is not automated

When all of my puzzles are resolved, I can get back to my main task and finish feature implementation. Obviously, this may take a long time - days or even weeks.

But, the time cost of the main task is less than an hour. What is the point of all this hard work? Well, it's simple; I'll earn my hours from all the bugs reported. From this small half-an-hour task, I will generate many tickets, and every one of them will give me extra cash.

Designer and Programmer

The only real differences between designer and programmer are the complexity of their respective tasks and the hourly rates they receive. Designers usually do more complex and higher level implementations, while programmers implement all low-level details.

Pulp Fiction (1994) by Quentin Tarantino

I'm a programmer and my task is to implement a class or method or to fix some functional bug. In most cases, I have only half an hour available. And, most tasks are bigger and require more time than that.

Puzzle driven development helps me break my task into smaller sub-tasks. I always start with a unit test. In the unit test, I'm trying to reproduce a bug or model the feature. When my test fails, I commit it and determine the amount of time I have left. If I still have time to make it pass---I do it, commit the changes and report to the project manager.

If I don't have time to implement the fix, I mark the unfinished pieces of code with @todo puzzles, commit them and report to the project manager that I've finished.

As you see, I'm fixing the code and breaking it at the same time. I'm fixing it with my new unit test, but breaking it with @todo puzzles.

This is how I help to increase the overall quality of the project - by fixing and breaking at the same time.

Tester

I'm a tester, and my primary motivation is to find bugs. This may contradict what you've heard before; but in XDSD, we plan to find a certain number of bugs at every stage of the project.

Fear and Loathing in Las Vegas (1998) by Terry Gilliam

As a tester, I receive tasks from my project manager. These tasks usually sound like "review feature X and find 10 bugs in it." The project manager needs a certain number of bugs to be found in order to fix the project. From his point of view, the project is fixed when, say, 200 bugs have been found. That's why he asks me to find more.

Thus, to respond to the request, I find bugs; this is my part of the "fixing" side of the bigger picture. At the same time, though, I can find defects on my own and report them. This is the "breaking" part of my mission.

© Yegor Bugayenko 2014–2018

Fluent Java HTTP Client


In the world of Java, there are plenty of HTTP clients to choose from. Nevertheless, I decided to create a new one, because none of the existing clients fully satisfied all of my requirements. Maybe I'm too demanding. Anyway, this is how my jcabi-http client looks when you make an HTTP request and expect a successful HTML page in return:

String html = new JdkRequest("https://www.google.com")
  .uri().path("/users").queryParam("id", 333).back()
  .method(Request.GET)
  .header("Accept", "text/html")
  .fetch()
  .as(RestResponse.class)
  .assertStatus(HttpURLConnection.HTTP_OK)
  .body();

I designed this new client with the following requirements in mind:

Simplicity

For me, this was the most important requirement. The client must be simple and easy to use. In most cases, I need only to make an HTTP request and parse the JSON response to return a value. For example, this is how I use the new client to return a current EUR rate:

String uri = "http://www.getexchangerates.com/api/latest.json";
String rate = new JdkRequest(uri)
  .header("Accept", "application/json")
  .fetch()
  .as(JsonResponse.class)
  .json().readArray().getJsonObject(0)
  .getString("EUR");

I assume that the above is easy to understand and maintain.

Fluent Interface

The new client has to be fluent, which means that the entire server interaction fits into one Java statement. Why is this important? I think that fluent interface is the most compact and expressive way to perform multiple imperative calls. To my knowledge, none of the existing libraries enable this type of fluency.

Testable and Extensible

I'm a big fan of interfaces, mostly because they make your designs both cleaner and highly extensible at the same time. In jcabi-http, there are five interfaces extended by 20 classes.

Request is an interface, as are Response, RequestURI, and RequestBody, which it exposes.

Use of interfaces makes the library highly extensible. For example, we have JdkRequest and ApacheRequest, which make actual HTTP calls to the server using two completely different technologies (JDK HttpURLConnection and Apache HTTP Client, respectively). In the future, it will be possible to introduce new implementations without breaking existing code.

Say, for instance, I want to fetch a page and then do something with it. These two calls perform the task differently, but the end results are the same:

String uri = "http://www.google.com";
Response page;
page = new JdkRequest(uri).fetch();
page = new ApacheRequest(uri).fetch();

XML and JSON Out-of-the-Box

There are two common standards that I wanted the library to support right out of the box. In most cases, the response retrieved from a server is in either XML or JSON format, and parsing that output manually has always been a hassle and extra work for me.

The jcabi-http client supports both of them out of the box, and it's possible to add more formats in the future as needed. For example, you can fetch an XML page and retrieve a string value from one of its elements:

String name = new JdkRequest("http://my-api.example.com")
  .header("Accept", "text/xml")
  .fetch()
  .as(XmlResponse.class)
  .xml().xpath("/root/name/text()").get(0);

Basically, the response produced by fetch() is decorated by XmlResponse. This then exposes the xml() method that returns an instance of the XML interface.
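To make the decoration idea concrete, here is a minimal sketch of my own; the Response and XmlDecorator types below are simplified stand-ins, not the real jcabi-http interfaces (the real xml() returns a jcabi XML object, not a DOM Document):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Simplified stand-in for a generic HTTP response.
interface Response {
  String body();
}

// Hypothetical decorator: adds XML parsing on top of any Response,
// without modifying the wrapped object.
final class XmlDecorator implements Response {
  private final Response origin;
  XmlDecorator(final Response resp) {
    this.origin = resp;
  }
  @Override
  public String body() {
    return this.origin.body();
  }
  public Document xml() throws Exception {
    // parse the body on demand
    return DocumentBuilderFactory.newInstance()
      .newDocumentBuilder()
      .parse(new InputSource(new StringReader(this.body())));
  }
}

public class Main {
  public static void main(String[] args) throws Exception {
    Response plain = () -> "<root><name>Jeff</name></root>";
    Document doc = new XmlDecorator(plain).xml();
    System.out.println(doc.getDocumentElement().getTagName()); // prints "root"
  }
}
```

The wrapped response stays untouched; the decorator only adds behavior, which is why new formats can be supported without changing existing classes.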

The same can be done with JSON through the Java JSON API (JSR-353).

None of the libraries that I'm aware of, or have worked with, offer this feature.

Immutable

The last requirement, but certainly not the least important, is that I need all interfaces of the library to be annotated with @Immutable. This is important because I need to be able to encapsulate an instance of Request in other immutable classes.
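As a sketch of why this matters (the Request interface here is a drastically simplified stand-in for the real com.jcabi.http.Request, and the Github class is entirely hypothetical): an immutable class can hold a Request in a final field, with no defensive copying, because the request can never change behind its back.

```java
// Stand-in for the real com.jcabi.http.Request interface
// (drastically simplified); the real one is annotated with @Immutable.
interface Request {
  String fetch();
}

// A hypothetical immutable class that encapsulates a Request.
// Because the Request itself is immutable, the field can be shared
// freely: no defensive copies, no synchronization.
final class Github {
  private final Request request;
  Github(final Request req) {
    this.request = req;
  }
  String users() {
    return this.request.fetch();
  }
}

public class Main {
  public static void main(String[] args) {
    Github github = new Github(() -> "{\"users\":[]}");
    System.out.println(github.users()); // prints {"users":[]}
  }
}
```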

P.S. A short summary of this article was published at JavaLobby.


How Much Do You Pay Per Line of Code?


Yes, I know, "line of code" (LoC) is a very wrong metric. There are tons of articles written about it, as well as famous books. However, I want to compare two projects in which I have participated recently and discuss some very interesting numbers.

Project #1: Traditionally Co-located

The first project I was a part of was performed by a traditionally co-located group of programmers. There were about 20 of them (I'm not counting managers, analysts, product owners, SCRUM masters, etc.). The project was a web auction site with pretty high traffic (over two million page views per day).

The code base size was about 200k lines, of which 150k was PHP, 35k JavaScript and the remainder CSS, XML, Ruby, and something else. I'm counting only non-empty and non-comment lines of code, using cloc.pl.

It was a commercial project, so I can't disclose its name.

Brazil (1985) by Terry Gilliam

The team was co-located in one office in Europe where everybody was working "from nine 'til five." We had meetings, lunches, desk-to-desk chats and lots of other informal communications. All tasks were tracked in JIRA.

Project #2: Extremely Distributed

The second project was an open source Java product, developed by an extremely distributed team of about 15 developers. We didn't have any chats or any other informal communications. We discussed everything in GitHub issues. The code base was significantly smaller with only about 30k lines, of which about 90% was Java and the rest in XML.

Shaolin Temple (1982) by Chang Hsin Yen

Maturity of Development

Both projects hosted their code bases on GitHub. Both teams were developing in feature branches, even for small fixes.

Both teams used build automation, continuous integration, pre-flight builds, static analysis and code reviews. This indicates the maturity of the project teams.

Both projects satisfied the requirements of their users. I'm mentioning this to emphasize that both projects produced valuable and useful lines of code. There was no garbage and almost no code duplication.

Show Me the Money

In both projects, my role was that of lead architect, and I knew their financial details. Besides that, I had access to both Git repositories, so I could measure how many new (or changed) lines were introduced by each team in, say, a three-month period.

Now, let's see the numbers.

The first project (the co-located one) was paying approximately €50,000 annually to a good developer, which was about $5,600 per month, or $35 per hour. The second one (the extremely distributed project) was paying $20-35 per hour, for completed tasks only, according to one of the principles of XDSD.

The first one, in three months, produced 59k new lines and removed 29k in the master branch, which totals 88k modified lines of code. The team spent about 10,000 man-hours producing them (20 programmers, three months, 170 working hours per month), which equates to about $350k. Therefore, the project cost a whopping

$3.98 per line

The second project, in the same three month period, produced 45k new lines and removed 9k, which comes to 54k in all. To complete this work, we spent only $7k (approximately 350 working hours in 650 tasks). Thus, the project cost merely:

¢13 per line

This also means that programmers were writing approximately 150 lines per hour or over a thousand per day. The Mythical Man-Month talks about 10 lines per day, which is a hundred times less than we saw in our project.
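For the skeptical, the arithmetic can be replayed; this little sketch (my own, using the rounded figures quoted above) reproduces the per-line costs and the writing speed:

```java
import java.util.Locale;

public class CostPerLine {
  public static void main(String[] args) {
    // Project #1: 88k changed lines, ~10,000 man-hours at $35/hour
    double colocated = 10000 * 35.0 / 88000;
    // Project #2: 54k changed lines for about $7,000 in ~350 hours
    double distributed = 7000.0 / 54000;
    double speed = 54000.0 / 350;
    System.out.printf(
      Locale.ROOT, "$%.2f vs $%.2f, about %.0f lines per hour%n",
      colocated, distributed, speed
    );
    // prints: $3.98 vs $0.13, about 154 lines per hour
  }
}
```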

$350k vs $7k, $3.98 vs ¢13? What do you think?

How to Validate the Numbers?

If you're curious, I'm using hoc to get the numbers from Git (it is explained in Hits-of-Code Instead of SLoC). You can validate the numbers for the second project here on GitHub: jcabi/jcabi-github.

Conclusion

What I'm trying to express with these numbers is that distributed programming is much more effective, money-wise, than a co-located team. Again, I can hear you saying that "line of code" is not a proper metric. But, come on, $0.13 vs. $3.98? Thirty times more expensive?

The Big Lebowski (1998) by Joel Coen

It's not about metrics anymore. It's about preventing wasteful man-hours and the huge waste of money that comes with them.

Can We Do the Same?

Of course, the same results can't be achieved by just telling your programmers to work from home and never come to the office. XDSD is not about that. XDSD is about strict quality principles, which should be followed by the entire team.

And when these principles are in place---you pay thirty times less.

What are your numbers? Please post your comments below.


Xembly, an Assembly for XML


I use XML in almost every one of my projects. And, despite all the fuss about JSON/YAML, I honestly believe that XML is one of the greatest languages ever invented. Also, I believe that the beauty of XML reveals itself when used in combination with related technologies.

For example, you can expose your data in XML and render it for the end user using an XSL stylesheet.

Another example would be when you validate the same data, before rendering, to ensure that its structure is correct. You can do this with an XSD schema. Alternatively, you can pick specific data elements out of the entire document by using XPath queries.

Essentially, these three technologies (XSL, XSD and XPath) are what make XML so powerful.

However, there are times when XML falls short; for instance, when an existing document needs to be modified just slightly. Let's use the following as an example:

<accounts>
  [...]
  <account id='34'>
    <name>Jeffrey</name>
    <balance>305</balance>
  </account>
  <account id='35'>
    <name>Walter</name>
    <balance>50090</balance>
  </account>
  [...]
</accounts>

The above code represents a list of accounts. Each account has its own id and several child elements. In our example, we need to find the account belonging to Jeffrey and increase its balance by 500. How would we do this?

Well, there are a few possible solutions:

  • SAX-parse the document, change the balance and save the stream;
  • DOM-parse it, find the element with XPath, change the value and then print it;
  • apply a parametrized XSL stylesheet;
  • apply a small XQuery script to make the changes.

All of these methods have their own drawbacks. However, all of them have one particular problem in common---they are very verbose. With each of the above methods, you need at least a page of code to perform this rather simple operation. Furthermore, if the logic of the operation becomes more complex, the amount of needed code grows much faster than you may expect.
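To give an idea of the verbosity, here is roughly what the second option (DOM-parse, find with XPath, change, print) looks like in plain Java; the class name and the inlined document are mine, and error handling is omitted:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class RaiseBalance {
  public static void main(String[] args) throws Exception {
    String xml = "<accounts>"
      + "<account id='34'><name>Jeffrey</name><balance>305</balance></account>"
      + "</accounts>";
    // DOM-parse the document
    Document doc = DocumentBuilderFactory.newInstance()
      .newDocumentBuilder()
      .parse(new InputSource(new StringReader(xml)));
    // find the element with XPath
    Node balance = (Node) XPathFactory.newInstance().newXPath().evaluate(
      "/accounts/account[name='Jeffrey']/balance",
      doc, XPathConstants.NODE
    );
    // change the value
    balance.setTextContent(
      String.valueOf(Integer.parseInt(balance.getTextContent()) + 500)
    );
    // and print the modified document back
    StringWriter out = new StringWriter();
    TransformerFactory.newInstance().newTransformer()
      .transform(new DOMSource(doc), new StreamResult(out));
    // the document now contains <balance>805</balance>
    System.out.println(out.toString());
  }
}
```

And this is with checked exceptions swept under the rug; production code would be noticeably longer.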

Simply put, XML lacks a tool for primitive data manipulations within a document. Perhaps it is this shortcoming that makes XML unpopular with some.

Anyway, here is a tool I created a few months ago: Xembly. It is an imperative language with a few simple directives, and it resembles Assembly in style; thus the name, Xembly. In Xembly, there are no loops, conditions or variables, just a sequence of directives with arguments.

Let's create a simple example. Say, for instance, we want to add a new account, number 36, to our list document. The code would look like this:

XPATH '/accounts';
ADD 'account';
ATTR 'id', '36';
ADD 'name';
SET 'Donny';
UP;
ADD 'balance';
SET '3400';

The above should be intuitively clear, but I'll explain just in case. First, the XPATH directive points us to the element found by the "/accounts" XPath query. This will be our root element. We assume here that it exists in the document. Therefore, if it is absent, our Xembly script will fail with a runtime exception.

Next, the ADD directive on line 2 creates a new XML element without any children or attributes. Then, the ATTR directive sets an attribute of this element. The code then adds the new child element name and sets its text value to "Donny" using the SET directive. Finally, we move our pointer back to the account element using UP, add the balance child element and set its value to "3400".

Our balance changing task can be expressed in Xembly with the following code:

XPATH '/accounts/account[name="Jeffrey"]/balance';
XSET '. + 500';

The XSET directive sets the element text value, similar to SET, but calculates it beforehand using the provided XPath expression . + 500.

Xembly performs all manipulations through DOM. Consequently, Xembly can be implemented inside any language that has a built-in DOM implementation.
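For example, the effect of the XSET directive above can be reproduced with nothing but the standard Java DOM and XPath APIs; this sketch is mine, not code from the Xembly library, but it is roughly what an implementation has to do:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class XsetSketch {
  public static void main(String[] args) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
      .newDocumentBuilder().newDocument();
    Node balance = doc.appendChild(doc.createElement("balance"));
    balance.setTextContent("305");
    XPath xpath = XPathFactory.newInstance().newXPath();
    // XSET '. + 500': evaluate the expression with the element
    // itself as the XPath context node
    double value = (Double) xpath.evaluate(
      ". + 500", balance, XPathConstants.NUMBER
    );
    balance.setTextContent(String.valueOf((long) value));
    System.out.println(balance.getTextContent()); // prints "805"
  }
}
```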

In the meantime, there is only one implementation of the Xembly language, in Java. Here is how it works:

Iterable<Directive> directives = new Directives()
  .xpath("/accounts")
  .add("account")
  .attr("id", "36")
  .add("name").set("Donny").up()
  .add("balance").set("3400");
new Xembler(directives).apply(document);

In this snippet, I'm using a supplementary script builder, Directives, which enables generation of directives in a fluent way. Then, I use the Xembler class (the name is similar to "assembler") to apply all specified directives to the document object of class org.w3c.dom.Document.

Additionally, Xembly can be used to build XML documents from scratch and as a replacement for traditional DOM building. A quick example:

System.out.println(
  new Xembler(
    new Directives().add("html")
      .add("head")
      .add("title")
      .set("Hello, world!")
  ).xml()
);

The above snippet produces the following output:

<html>
  <head>
    <title>Hello, world!</title>
  </head>
</html>

For me, this appears simpler and more compact.

As usual, your bug reports and suggestions are always welcome. Please submit them to GitHub issues :)


PhantomJS as an HTML Validator


I created phandom.org a few months ago, but yesterday finally found the time to make some needed changes to it. So, now is a good time to explain how I'm using Phandom in some of my unit tests.

Before I get started, though, I should say a few words about phantomjs, which is a JavaScript interface to WebKit; in other words, a web browser without a user interface. WebKit itself is a C++ library that enables manipulation of HTML content through DOM calls. For example, this is a simple JavaScript snippet located in example.js:

var page = require('webpage').create();
page.open(
  'http://google.com',
  function() {
    console.log('loaded!');
    phantom.exit(0);
  }
);

We run phantomjs from the command line:

$ phantomjs example.js

PhantomJS creates a page object (provided by the webpage module inside phantomjs) and then asks it to open() a Web page. The object communicates with WebKit and converts this call into DOM instructions. After that, the page loads, and the PhantomJS engine terminates on line 6 of the script.

WebKit renders the web page with all of its components, such as CSS and JavaScript, just as any standard Web browser would.

So far so good, and this is the traditional way of using PhantomJS. Now, on to giving you an idea of how Phandom (which stands for "PhantomJS DOM") works inside Java unit tests:

To test this, let's give phantomjs an HTML page and ask it to render it. When the page is ready, we'll ask phantomjs to show us how the HTML looks in WebKit. If we see the elements we need, we're good to go. Let's use the following example:

import com.rexsl.test.XhtmlMatchers;
import org.hamcrest.MatcherAssert;
import org.junit.Test;
import org.phandom.Phandom;
public class DocumentTest {
  @Test
  public void rendersValidHtml() {
    Document doc = new Document();
    // This is the method we're testing. It is supposed
    // to return a valid HTML without broken JavaScript
    // and with all required HTML elements.
    String html = doc.html();
    MatcherAssert.assertThat(
      XhtmlMatchers.xhtml(new Phandom(html).dom()),
      XhtmlMatchers.hasXPath("//p[.='Hello, world!']")
    );
  }
}

When we use the above code, here is what happens. First, we get the HTML as a String from the doc object and pass it to the Phandom constructor. Then, we call the Phandom.dom() method to get an instance of the class org.w3c.dom.Document.

If our HTML contains any broken JavaScript code, the dom() method throws a runtime exception and the unit test fails. If the HTML is clean and WebKit is able to render it without problems, the test passes.

I'm using this mechanism in a few different projects, and it works quite well. Therefore, I highly recommend it.

Of course, you shouldn't forget that you must have phantomjs installed on your build machine. In order to avoid unit test failures when phantomjs is not available, I've created the following supplementary method:

public class DocumentTest {
  @Test
  public void rendersValidHtml() {
    Assume.assumeTrue(Phandom.installed());
    // the rest of the unit test method body...
  }
}

Enjoy, and feel free to report any bugs or problems you encounter to GitHub issues :)


Movies for Thanasis


First Post


This is the first post on my new blog. Therefore, it's not about anything in particular---just an introduction and my way of saying hello. This blog will be primarily about software development ideas. As my About Me page says, I'm passionate about software quality, and will write solely about my ideas and views on it.

Anyway, welcome to my new blog. Together, let's see how this works out! :)

BTW, I purchased the Cambria font just for this new blog. It cost €98. Nevertheless, I think it's a good investment for this new venture.


D29, a prototype


D29 is a prototype of a new programming language and development platform. Well, actually, it's not even a prototype yet, just an idea. As I see it, the languages we have now (even the most modern ones) are still close to COBOL/C and far from being truly elegant and modern.

It would be great if we could design a language/platform that mixes object-oriented and functional programming and has all the features listed below out of the box.

Key principles:

  • everything is an object
  • byte and bytes are the only built-in types
  • strict compile-time static analysis

Native support of:

Maybe native support of:

  • cloud computing

Features:

  • no mutable objects (why?)
  • no public/protected object properties
  • no static properties/methods (why?)
  • no global variables
  • no pointers
  • no enums
  • no NULL (why?)
  • no scalar types, like int, float, etc.
  • no unchecked exceptions (why?)
  • no interface-less classes
  • no implementation inheritance (why?)
  • no operator overloading
  • all methods are either final or abstract
  • no mutability of method arguments
  • no mocking (why?)
  • no reflection
  • no instanceof operator (why?)
  • no root class (like, for example, Object in Java)
  • instant object destruction instead of garbage collection

Maybe:

  • native support of Java classes/libraries
  • compilation into Java byte code

If you are interested in contributing, email me. Maybe we'll do something together :)


Puzzle Driven Development


PDD, or Puzzle Driven Development, is a method used to break down programming tasks into smaller ones and enable their implementation in parallel. The PDD method is used widely in XDSD methodology. The method is pending a USPTO patent (application no. 12/840,306).

Let's review the method with an example. Say, for instance, you are a programmer and have been tasked to design and implement a Java class. This is the formal task description: "class DBWriter has to extend java.io.Writer abstract class and save all incoming data into the database."

You have one hour to implement this task. It is obvious to you that one hour is not enough, because the problem is much bigger and requires more work than the allotted time allows. Additionally, there are numerous unknowns:

  • What information do we need to save, and in what format?
  • What is the DB schema? Is it an SQL or NoSQL database?
  • How to connect to the DB? JDBC? JPA? DAO?
  • How to handle exceptions?

Let's keep all these unknowns in mind as we try to solve the problem at the highest level of abstraction. Of course, we start with a test:

import org.junit.*;
import static org.mockito.Mockito.*;
public class DBWriterTest {
  @Test
  public void testSavesDataIntoDatabase() throws Exception {
    DataBridge mapper = mock(DataBridge.class);
    Writer writer = new DBWriter(mapper);
    try {
      writer.write("hello, world!");
    } finally {
      writer.close();
    }
    verify(mapper).insert("hello, world!");
  }
}

In the above test, we define the expected behavior of the class. The test fails to compile because there are two missing classes: DataBridge and DBWriter. Let's implement the bridge first:

import java.io.IOException;
public interface DataBridge {
  void insert(String text) throws IOException;
}

Next, the writer itself:

import java.io.IOException;
import java.io.Writer;
import java.util.Arrays;
public class DBWriter extends Writer {
  private final DataBridge bridge;
  public DBWriter(DataBridge brdg) {
    this.bridge = brdg;
  }
  @Override
  public void flush() throws IOException {
  }
  @Override
  public void close() throws IOException {
  }
  @Override
  public void write(char[] cbuf, int off, int len) throws IOException {
    String data = new String(Arrays.copyOfRange(cbuf, off, off + len));
    this.bridge.insert(data);
  }
}

Using the above code, we solve the problem. In this example, we successfully designed, implemented and tested the required DBWriter class, which can now immediately be used "as is" by other classes.

Of course, the implementation is not finished, since we are not writing anything to the database yet. Furthermore, we still haven't answered the majority of the questions asked in the sample scenario. For instance, we still don't know exactly how the database needs to be connected, its type (SQL or NoSQL), the correct data format and so on. However, we've already made a number of important architectural assumptions, which allowed us to implement the class and make it usable by other classes.

Now it's time to identify the unknowns in our code and mark them with puzzles. Every puzzle is a request for refinement. We want to ask someone else to help us refine and correct our assumptions. Here is the first puzzle we need to add:

public interface DataBridge {
  /**
   * @todo #123 I assumed that a simple insert() method will be
   *  enough to insert data into the database. Maybe it's
   *  not true, and some sort of transaction support may be
   *  required. We should implement this interface and create
   *  an integration test with a database.
   */
  void insert(String text) throws IOException;
}

The puzzle has three elements: the @todo tag, the #123 locator and a comment. The locator says that the puzzle was created while working with ticket #123.

Let's add one more puzzle:

void write(char[] cbuf, int off, int len) throws IOException {
  // @todo #123 I assumed that the data should be sent to the database
  //  as its received by the writer. Maybe this assumption
  //  is wrong and we should aggregate data into blocks/chunks
  //  and then send them to the data bridge.
  String data = new String(Arrays.copyOfRange(cbuf, off, off + len));
  this.bridge.insert(data);
}

This puzzle indicates one of our concerns because we are not sure that the architectural decision is right. Actually, the design is very primitive at the moment and very likely to be incorrect. To refine it and refactor, we require more information from the task specifier.

The task is finished. Now, you can reintegrate your branch into master and return the ticket to whoever assigned it to you. His task now is to find other people who will be able to resolve the puzzles we just created.

Every puzzle created now will produce other puzzles, which will be resolved by other people. Consequently, our simple one-hour task can potentially generate hundreds of other tasks, which may take days or even years to complete. Nevertheless, your goal, while working on your specific task, is to finish it as soon as possible and reintegrate your branch into master.

Best Practices

There are a few simple rules that help you to place puzzles correctly.

First, you should put your @todo annotations at the point where your code hits a stub: for example, in a unit test. You're implementing a test, and it fails because the class has not yet been implemented. You skip the test with the @Ignore annotation and add a @todo puzzle to its JavaDoc.

Second, your puzzle should remain as near as possible to the code element that is hitting the stub. Say that you have a unit test that has three test methods. All of them fail now because the class has not been implemented. The best approach would be to ignore every one of them and create three (!) puzzles. Each one of the puzzles should explain what you expect from the class and how it should be implemented.

Third, be as descriptive as possible. Your puzzle will soon be a task definition for someone else. So, explain clearly what you expect the next person to implement, how to do it, which documentation to use and so on and so forth. There should be enough information present that the next person assigned to the puzzles is able to implement your required classes without additional input from you!

BTW, the puzzle collection process can be automated by means of our pdd Ruby gem and the 0pdd.com hosted service.

